Collocated with ICAPS 2021
In a dynamic job shop scheduling problem (DJSSP), limited machine resources are used to produce a collection of jobs, each comprising a sequence of operations that must be processed on a specific set of machines. A job is considered done once its last operation is finished. In addition, DJSSPs account for stochastic events such as random job arrivals, unexpected machine breakdowns, and due-time changes, all of which happen frequently in real-world manufacturing. In this challenge, participants are invited to design computer programs capable of automatically generating policies (usually in the form of agents) for a collection of DJSSPs. For a given DJSSP, the learned scheduling policy or agent(s) is expected to allocate machine resources to jobs in both space and time, with the aim of finishing all the jobs while optimizing some business metrics.
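To make the problem statement concrete, the following is a minimal sketch of a job shop with job arrivals and a pluggable dispatching policy. It is not the challenge simulator or its interface: the `Job` class, the `simulate` function, and the `spt` and `fifo` rules are hypothetical names chosen for illustration, and breakdowns and due-time changes are left out.

```python
from dataclasses import dataclass


@dataclass
class Job:
    arrival: int   # time at which the job becomes available
    ops: list      # ordered (machine_id, duration) pairs


def simulate(jobs, policy):
    """Schedule one operation per step as chosen by `policy`; return the makespan."""
    machine_free = {}                       # machine_id -> time it becomes idle
    job_ready = [j.arrival for j in jobs]   # earliest start of each job's next operation
    next_op = [0] * len(jobs)               # index of each job's next operation
    while True:
        pending = [i for i, j in enumerate(jobs) if next_op[i] < len(j.ops)]
        if not pending:
            break
        i = policy(pending, jobs, next_op, job_ready)
        machine, duration = jobs[i].ops[next_op[i]]
        # An operation starts once both its job and its machine are free.
        start = max(job_ready[i], machine_free.get(machine, 0))
        machine_free[machine] = job_ready[i] = start + duration
        next_op[i] += 1
    return max(job_ready)


def spt(pending, jobs, next_op, job_ready):
    # Shortest-processing-time rule: pick the job whose next operation is shortest.
    return min(pending, key=lambda i: jobs[i].ops[next_op[i]][1])


def fifo(pending, jobs, next_op, job_ready):
    # Pick the job whose next operation became ready first.
    return min(pending, key=lambda i: job_ready[i])


# Hypothetical two-job, two-machine instance; the second job arrives at time 1.
jobs = [Job(0, [(0, 3), (1, 2)]), Job(1, [(0, 1), (1, 4)])]
makespan_spt = simulate(jobs, spt)
makespan_fifo = simulate(jobs, fifo)
```

In this toy setting, a learned agent would take the place of the hand-written `policy` function, and the makespan is one possible stand-in for a business metric.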
We developed a total of thirteen DJSSP virtual environments (simulators) for this challenge, with which agents are trained and tested. They are based on real-world production lines but are carefully designed to make the problem both challenging and manageable. All environments share a uniform interface, but the configurations of machines, jobs, metrics, and events vary among tasks to test the generalization ability of submitted solutions. A practical AutoRL solution should be able to generalize to a wide range of unseen tasks. In this challenge, the participant's solution should automatically train agent(s) for each given DJSSP environment with the aim of maximizing the long-term return, and the evaluation is based on the performance of the agents on a collection of different environments.
The competition will be held in three phases. In the Feedback Phase, participants can upload their solutions and get daily feedback on how they perform on five feedback environments, and then make improvements. In the Check Phase, participants submit their final solution to verify that it can be properly evaluated, and each participant is offered one opportunity to make corrections. Lastly, in the Private Phase, solutions are evaluated on five unseen private environments and ranked. The participant whose solution has the best average rank over all the private environments is the winner.
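The average-rank criterion can be sketched as follows. This is only an illustration under stated assumptions: higher scores are taken to be better, tied scores share the best rank, and the environment names, participant names, and scores are made up. The official scoring code may handle these details differently.

```python
from statistics import mean


def average_ranks(scores):
    """scores: {env_name: {participant: score}}; higher score is assumed better."""
    ranks = {}
    for env_scores in scores.values():
        # Rank 1 goes to the highest score; tied scores share the best rank.
        ordered = sorted(env_scores.values(), reverse=True)
        for participant, score in env_scores.items():
            ranks.setdefault(participant, []).append(ordered.index(score) + 1)
    return {participant: mean(r) for participant, r in ranks.items()}


# Hypothetical scores on three private environments.
scores = {
    "env_a": {"alice": 10.0, "bob": 8.0, "carol": 9.0},
    "env_b": {"alice": 5.0, "bob": 7.0, "carol": 6.0},
    "env_c": {"alice": 3.0, "bob": 1.0, "carol": 2.0},
}
avg = average_ranks(scores)
winner = min(avg, key=avg.get)  # the lowest average rank wins
```

Ranking per environment before averaging means that a large score margin on a single environment counts no more than a narrow win, which favors solutions that perform consistently across tasks.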
Competition Website: https://www.automl.ai/competitions/14
Apr. 21st, 2021: Beginning of the competition (the Feedback Phase).
Jul. 2nd, 2021: End of the Feedback Phase and beginning of the Check Phase.
Jul. 9th, 2021: End of the Check Phase and beginning of the Private Phase.
Jul. 16th, 2021: End of the Private Phase.
Jul. 23rd, 2021: Announcement of the winner.
Aug. 21st, 2021: Beginning of IJCAI 2021 conference.
AutoRL Challenge - DJSSP
Job shop scheduling is a key problem in many real-world manufacturing systems and has been studied for decades. The dynamic job shop scheduling problem (DJSSP), which extends the original problem with commonly encountered stochastic events, has recently attracted academic and industrial attention. Various methods have been proposed to solve DJSSP, ranging from classical operations research-based methods and heuristic methods to meta-heuristic algorithms. Reinforcement Learning (RL), an emerging intelligent decision-making method, has also been employed to solve DJSSPs. Considering that DJSSP is a sequential decision-making problem that can be converted to a Markov Decision Process (MDP), and given the recent successful applications of RL techniques in games and real-world problems, we believe that RL is a promising approach to DJSSP and that substantial improvement over existing work is possible. However, most of these RL successes depend on domain expertise (usually both in RL and in the business) and enormous computing power. Automatic Reinforcement Learning (AutoRL) aims to lower the threshold of RL applications so that more organizations and individuals can benefit. AutoRL tries to train agents with as little human effort as possible. Given an RL problem, an AutoRL method should automatically form the state and action spaces, generate network architectures (if deep RL methods are employed), select a proper training algorithm and hyper-parameters, and finally output agents, balancing effectiveness and efficiency so that the agents reach reasonable performance at acceptable cost.
To summarize, the motivations for this DJSSP challenge are:
• Dynamic job shop scheduling is an important real-world problem;
• RL is a promising solution to DJSSP;
• AutoRL is necessary for the extensive application of RL techniques.
To prevail in the proposed challenge, participants should propose automatic solutions that can effectively and efficiently train agents for a given set of DJSSP environments. With an environment as input, a solution should automatically construct the state and action spaces, shape rewards, generate neural network architectures (if a deep RL method is used), select an RL algorithm and its hyper-parameters, and train agents. Here, we list some specific questions that the participants should consider and answer:
• How to improve the in-distribution generalization capability of RL approaches?
• How to design a generic solution that is applicable in unseen DJSSP tasks?
• How to represent the states and actions in DJSSPs so that learning performance and efficiency can be improved?
• How to automatically shape the reward to make the learning more efficient?
• How to automatically generate network architectures for a given task?
• How to automatically choose the RL algorithm and tune its hyper-parameters?
• How to improve data efficiency?
• How to keep the computational and memory cost acceptable?
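As one illustration of automatically choosing an algorithm and tuning its hyper-parameters under a fixed budget, the sketch below runs a plain random search. Random search is only a simple baseline chosen here for brevity, not a method prescribed by the challenge. The search space, the `random_search` helper, and the `toy_evaluate` function are hypothetical; in practice the evaluation would train an agent with the sampled configuration and return its mean episode return.

```python
import random


def random_search(evaluate, space, budget, seed=0):
    """Sample `budget` configurations from `space`; return the best one and its score."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(budget):
        cfg = {name: rng.choice(values) for name, values in space.items()}
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


# Hypothetical search space over the algorithm and two hyper-parameters.
space = {
    "algorithm": ["dqn", "ppo"],
    "learning_rate": [1e-4, 3e-4, 1e-3],
    "gamma": [0.95, 0.99],
}


def toy_evaluate(cfg):
    # Stand-in for "train an agent with cfg and return its mean episode return".
    bonus = 0.1 if cfg["algorithm"] == "ppo" else 0.0
    return -abs(cfg["learning_rate"] - 3e-4) * 1000 + cfg["gamma"] + bonus


best_cfg, best_score = random_search(toy_evaluate, space, budget=20)
```

The `budget` argument caps the number of training runs, which is one straightforward way to keep the computational cost bounded.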
First Place Prize: 5,000 USD
Second Place Prize: 2,000 USD
Third Place Prize: 1,000 USD
Special Prize (4th and 5th places): 500 USD
Wei-Wei Tu, 4Paradigm Inc.
Hugo Jair Escalante, Instituto Nacional de Astrofisica, Optica y Electronica, INAOE, Mexico
Isabelle Guyon, Université Paris-Saclay, ChaLearn
Qiang Yang, Hong Kong University of Science and Technology
Committee (alphabetical order)
Bin Feng, 4Paradigm Inc.
Mengshuo Wang, 4Paradigm Inc.
Tailin Wu, 4Paradigm Inc.
Xiawei Guo, 4Paradigm Inc.
Yuxuan He, 4Paradigm Inc.
Please contact the organizers if you have any problem concerning this challenge.
Previous AutoML Challenges:
About 4Paradigm Inc.
Founded in early 2015, 4Paradigm is one of the world’s leading AI technology and service providers for industrial applications. 4Paradigm’s flagship product, the AI Prophet, is an AI development platform that enables enterprises to build their own AI applications and thereby significantly increase their operational efficiency. Using the AI Prophet, a company can develop a data-driven “AI Core System”, which can be regarded as a second core system next to the traditional transaction-oriented Core Banking System (IBM Mainframe) often found in banks. Beyond this, 4Paradigm has also developed more than 100 AI solutions for use in settings such as finance, telecommunication, and internet applications. These solutions include, but are not limited to, smart pricing, real-time anti-fraud systems, precision marketing, and personalized recommendation. While 4Paradigm can set up an entirely new paradigm for how an organization uses its data, its scope of services does not stop there. 4Paradigm combines state-of-the-art machine learning technologies and practical experience in a team of experts ranging from scientists to architects. This team has built China’s largest machine learning system and the world’s first commercial deep learning system. With its core team pioneering research on “Transfer Learning,” 4Paradigm takes the lead in this area and, as a result, has drawn great attention from tech giants worldwide.
ChaLearn is a non-profit organization with vast experience in the organization of academic challenges. ChaLearn is interested in all aspects of challenge organization, including data gathering procedures, evaluation protocols, novel challenge scenarios (e.g., competitions), training for challenge organizers, challenge analytics, result dissemination and, ultimately, advancing the state-of-the-art through challenges.