AutoWSL 2019


The Asian Conference on Machine Learning (ACML) 2019 will be held at WINC AICHI, Nagoya, Japan, from November 17 to 19, 2019. AutoWSL is one of the competitions in the main conference, provided by 4Paradigm, ChaLearn, RIKEN and Microsoft and co-organized with WSL-workshop@ACML 2019.
UC terminated its subscriptions with the world's largest scientific publisher in a push for open access to publicly funded research, since “Knowledge should not be accessible only to those who can pay,” said Robert May, chair of UC's faculty Academic Senate. Similarly, machine learning should not be accessible only to those who can pay. Specifically, modern machine learning is migrating to the era of complex models (e.g., deep neural networks), which require a plethora of well-annotated data. Large companies have enough money to collect well-annotated data. However, for startups or non-profit organizations, such data is barely acquirable due to the cost of labeling (e.g., via crowd-sourcing platforms). Besides, well-annotated data may not exist at all due to natural scarcity in a given domain (e.g., Alzheimer's disease or earthquake prediction). Weakly-supervised learning (WSL) is defined as the collection of machine learning problem settings and algorithms that share the same goals as supervised learning but can access only less supervised information than supervised learning. These facts and practical issues motivate us to study WSL, since it does not require such a massive amount of annotated data. What's more, traditional WSL methods have many hyperparameters that must be tuned for each problem, so deploying a WSL method successfully requires considerable human effort.
Here, we propose the first AutoWSL competition, which aims at proposing automated solutions for WSL-related tasks. This challenge is restricted to binary classification problems, which come from different application domains. Three practice datasets are provided to the participants for developing AutoWSL solutions. Afterward, solutions will be evaluated on 18 unseen feedback datasets and 18 private final datasets without human intervention. The results on these datasets determine the final ranking.
In the AutoWSL competition, we will focus on three popular tasks in WSL, i.e., semi-supervised learning (some samples are unlabeled), positive-unlabeled learning (samples are only positive or unlabeled, with no negative samples), and learning from noisy labels (all samples are labeled, but some labels can be wrong). These are three disjoint tasks, and they will NOT simultaneously appear in a single dataset. Auxiliary information will help participants identify which task they need to perform on each dataset.
AutoWSL will pose new challenges to the participants, as listed below:
- How to automatically deal with various kinds of WSL tasks?
- How to automatically extract useful features for different tasks?
- How to automatically handle different amounts of supervised information?
- How to automatically design effective learning models to deal with various structured data?

Additionally, participants should also consider:
- How to automatically and efficiently select proper hyper-parameters?
- How to make the solution more generic, i.e., how to make it applicable for unseen tasks?
- How to keep the computational and memory cost acceptable?


Participants should log in to our platform to start the challenge. Please follow the instructions in platform [Get Started] to get access to the data, learn the data format and submission interface, and download the starting kit.

This page describes the datasets used in the AutoWSL challenge. 39 binary classification datasets are prepared for this competition. Three practice datasets, which can be downloaded, are provided to the participants so that they can develop their AutoWSL solutions offline. Besides that, another eighteen validation datasets are provided to participants to evaluate the public leaderboard scores of their solutions. Afterward, their solutions will be evaluated with eighteen test datasets without human intervention.


This challenge is restricted to binary classification problems, which come from different application domains. We will focus on three popular tasks in WSL, i.e., semi-supervised learning (some samples are unlabeled), positive-unlabeled learning (samples are only positive or unlabeled, with no negative samples), and learning from noisy labels (all samples are labeled, but some labels can be wrong). These are three disjoint tasks, and they will not simultaneously appear in a single dataset.


Each dataset is split into two subsets, namely the training set and the testing set.

Both sets have:
• a main data file that stores the main table (label excluded);
• an info dictionary that contains important information about the dataset, including feature types.

The training set has an additional label file that stores the labels associated with the training data.

Table files

Each table file is a CSV file that stores a table (main or related), with '\t' as the delimiter. The first row indicates the names of features, and the following rows are the records.
The type of each feature can be found in the info dictionary that will be introduced soon.
There are 4 types of features, indicated by "cat", "num", "multi-cat", and "time", respectively:
• cat: categorical feature, an integer.
• num: numerical feature, a real value.
• multi-cat: multi-value categorical feature, a set of integers separated by commas. The size of the set is not fixed and can differ between instances. Examples include the topics of an article, the words in a title, or the items bought by a user.
• time: time feature, an integer that indicates the timestamp.
Note: Categorical and multi-value categorical features with a large number of values that follow a power law might be included.
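
As an illustration of the table format above, the following minimal sketch parses a tab-delimited table using only the Python standard library. The `SCHEMA` mapping mirrors the `schema` field of the info dictionary; the column names, the sample rows, the helper names `parse_value` and `read_table`, and the treatment of empty cells as missing values are all assumptions made for this example, not part of the official starting kit.

```python
import csv
import io


def parse_value(raw, ftype):
    """Convert one raw cell to a Python value according to its feature type."""
    if raw == "":
        return None  # assumption: an empty cell denotes a missing value
    if ftype in ("cat", "time"):
        return int(raw)  # both are documented as integers
    if ftype == "num":
        return float(raw)
    if ftype == "multi-cat":
        return {int(v) for v in raw.split(",")}  # comma-separated set of integers
    raise ValueError(f"unknown feature type: {ftype}")


def read_table(handle, schema):
    """Read a tab-delimited table whose first row holds the feature names."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {name: parse_value(row[name], schema[name]) for name in schema}
        for row in reader
    ]


# Hypothetical schema and data, for illustration only.
SCHEMA = {"c_1": "cat", "n_1": "num", "m_1": "multi-cat", "t_1": "time"}
SAMPLE = (
    "c_1\tn_1\tm_1\tt_1\n"
    "3\t0.5\t1,4,7\t1546300800\n"
    "8\t-2.25\t4\t1546387200\n"
)

rows = read_table(io.StringIO(SAMPLE), SCHEMA)
print(rows[0])
```

Because the delimiter is a tab, the commas inside a multi-cat cell stay within a single field and can be split afterwards.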

Label file

The label file is associated only with the main table in the training set. It is a CSV document that contains exactly one column, with the first row as the header and the remaining rows indicating the labels of the corresponding instances in the main table. We use 1, 0, and -1 to indicate positive, unlabeled, and negative examples, respectively.
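
The label encoding can be checked against the task type with a short sketch like the one below. The expected label sets per task are inferred from the task descriptions on this page (semi-supervised: all three values; positive-unlabeled: no negatives; noisy labels: no unlabeled samples) and are an assumption rather than an official specification. The header name `label`, the sample data, and the helper names are likewise hypothetical.

```python
import csv
import io
from collections import Counter

LABEL_NAMES = {1: "positive", 0: "unlabeled", -1: "negative"}

# Assumption: label values expected for each task, inferred from the task descriptions.
EXPECTED = {"ssl": {1, 0, -1}, "pu": {1, 0}, "noisy": {1, -1}}


def read_labels(handle):
    """Read a one-column CSV label file, skipping the header row."""
    reader = csv.reader(handle)
    next(reader)  # header
    # int(float(...)) tolerates both "1" and "1.0" style cells
    return [int(float(row[0])) for row in reader if row]


def summarize(labels, task):
    """Count labels by name and reject values that the task should not contain."""
    counts = Counter(labels)
    unexpected = set(counts) - EXPECTED[task]
    if unexpected:
        raise ValueError(f"labels {sorted(unexpected)} not expected for task {task!r}")
    return {LABEL_NAMES[k]: counts.get(k, 0) for k in sorted(EXPECTED[task], reverse=True)}


# Hypothetical label file for a positive-unlabeled dataset.
SAMPLE = "label\n1\n0\n0\n1\n0\n"
labels = read_labels(io.StringIO(SAMPLE))
summary = summarize(labels, "pu")
print(summary)
```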

Info dictionary

Important information about each dataset is stored as a Python dictionary in a file named info.json, which serves as an input to the participants' AutoML solutions. For public datasets, we provide an info.json file that contains the dictionary. The fields of info are detailed below.
• task: 'pu', 'ssl', or 'noisy'.
• time_budget: the time budget of the dataset.
• start_time: DEPRECATED.
• schema: a dictionary that stores information about the table. Each key is the name of a feature, and its corresponding value indicates the type of that column.

  • Example:

    {
      "task": "pu",
      "time_budget": 500,
      "start_time": 10550933,
      "schema": {
        "c_1": "cat",
        "n_1": "num",
        "t_1": "time"
      }
    }
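
A solution will typically start by reading this dictionary and grouping columns by feature type. The sketch below does so for the example above; the helper name `columns_by_type` is hypothetical, and the JSON is embedded as a string here only to keep the example self-contained (in practice it would be read from the info.json file).

```python
import json
from collections import defaultdict

# The example info dictionary from this page, embedded as a string for illustration.
INFO_JSON = """
{
  "task": "pu",
  "time_budget": 500,
  "start_time": 10550933,
  "schema": {"c_1": "cat", "n_1": "num", "t_1": "time"}
}
"""

VALID_TASKS = {"pu", "ssl", "noisy"}


def columns_by_type(info):
    """Group feature names by their declared type ('cat', 'num', 'multi-cat', 'time')."""
    groups = defaultdict(list)
    for name, ftype in info["schema"].items():
        groups[ftype].append(name)
    return dict(groups)


info = json.loads(INFO_JSON)
if info["task"] not in VALID_TASKS:
    raise ValueError(f"unexpected task: {info['task']!r}")

groups = columns_by_type(info)
print(info["task"], info["time_budget"], groups)
```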

Dataset Credits

We thank the following resources for providing us with excellent datasets:
  • Dua, D. and Graff, C. (2019). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science.


This challenge has three phases. Participants are provided with practice datasets, which can be downloaded, so that they can develop their AutoWSL solutions offline. Then the code is uploaded to the platform, and participants receive immediate feedback on the performance of their method on another eighteen validation datasets. After the feedback phase terminates, there is a check phase, in which participants are allowed to submit their code only once on private datasets in order to debug. Participants won't be able to read detailed logs, but they can see whether their code reports errors. Last, in the final phase, participants' solutions will be evaluated on eighteen test datasets. The ranking in the final phase will count towards determining the winners.

Submitted code is trained and tested automatically, without any human intervention. Code submitted in the feedback (resp. final) phase is run on all 18 feedback (resp. final) datasets in parallel on separate compute workers, each one with its own time budget.

The identities of the datasets used for testing on the platform are concealed. The data are provided in raw form (no feature extraction) to encourage researchers to perform automatic feature learning. All problems are binary classification problems. The tasks are constrained by the time budget (provided in the metafile of each dataset).

Here is some pseudo-code of the evaluation protocol:

    # For each dataset, our evaluation program calls the model constructor:
    from model import Model

    M = Model(metadata=dataset_metadata)
    with timer.time_limit('training'):
        M.train(train_dataset, train_label)

    # A fresh instance is created for prediction:
    M = Model(metadata=dataset_metadata)
    with timer.time_limit('predicting'):
        M.load(temp_dir)
        y_pred = M.predict(test_dataset)

It is the responsibility of the participants to make sure that neither the "train" nor the "predict" method exceeds the time budget.
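
To make the interface concrete, here is a deliberately trivial sketch of a `Model` class that follows the calls in the pseudo-code and guards its own training time. Several points are assumptions made for illustration: the `save` method (the pseudo-code only shows `load`), the file name `model.pkl`, the interpretation of `time_budget` as seconds, the 0.8 safety margin, and the use of plain Python lists for data and labels. The "model" itself only predicts the observed fraction of positive labels and is not a meaningful baseline.

```python
import os
import pickle
import tempfile
import time


class Model:
    """Toy model: scores every test row with the fraction of positive training labels."""

    def __init__(self, metadata):
        self.metadata = metadata
        self.score = 0.5

    def train(self, train_dataset, train_label):
        # Assumption: time_budget is in seconds; keep a 20% safety margin.
        deadline = time.monotonic() + 0.8 * self.metadata["time_budget"]
        positives = total = 0
        for y in train_label:
            if time.monotonic() > deadline:
                break  # stop early rather than overrun the budget
            total += 1
            positives += int(y == 1)
        self.score = positives / total if total else 0.5

    def save(self, directory):
        # Assumption: state is persisted so that a fresh instance can load it.
        with open(os.path.join(directory, "model.pkl"), "wb") as f:
            pickle.dump(self.score, f)

    def load(self, directory):
        with open(os.path.join(directory, "model.pkl"), "rb") as f:
            self.score = pickle.load(f)

    def predict(self, test_dataset):
        return [self.score] * len(test_dataset)


# Usage mirroring the evaluation protocol, on hypothetical data.
metadata = {"task": "pu", "time_budget": 500}
with tempfile.TemporaryDirectory() as temp_dir:
    m = Model(metadata=metadata)
    m.train([[0.1], [0.2], [0.3], [0.4]], [1, 0, 0, 1])
    m.save(temp_dir)

    m = Model(metadata=metadata)  # fresh instance, as in the protocol
    m.load(temp_dir)
    y_pred = m.predict([[0.5], [0.6], [0.7]])

print(y_pred)
```

Checking the clock inside the training loop, rather than relying on the platform's timer, lets a solution stop gracefully and still return a usable model.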


For each dataset, we compute ROC AUC as the evaluation metric. Participants are ranked according to AUC on each dataset. After the AUC is computed for all 18 datasets, the average rank is used as the final score and shown on the leaderboard. It is computed by averaging the ranks (among all participants) of the AUC obtained on the 18 datasets.
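
The scoring rule can be sketched as follows. This is not the organizers' scoring program: the pairwise AUC computation is a simple quadratic-time version chosen for readability, giving tied participants their average rank is an assumption, and the AUC values in the table are made-up toy numbers.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y != 1]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ranks_desc(values):
    """Rank participants by value, 1 = highest; tied values share their average rank."""
    ranks = {}
    for name, v in values.items():
        higher = sum(1 for other in values.values() if other > v)
        equal = sum(1 for other in values.values() if other == v)
        ranks[name] = higher + (equal + 1) / 2
    return ranks


def average_rank(auc_table):
    """auc_table maps dataset -> {participant: AUC}; returns participant -> mean rank."""
    totals = {}
    for per_dataset in auc_table.values():
        for name, rank in ranks_desc(per_dataset).items():
            totals[name] = totals.get(name, 0.0) + rank
    return {name: total / len(auc_table) for name, total in totals.items()}


# Toy values for illustration only (two datasets instead of 18).
auc_table = {
    "d1": {"a": 0.9, "b": 0.8, "c": 0.7},
    "d2": {"a": 0.6, "b": 0.85, "c": 0.7},
}
final_scores = average_rank(auc_table)
example_auc = roc_auc([1, -1, 1, -1], [0.9, 0.8, 0.4, 0.1])
print(final_scores, example_auc)
```

A lower average rank is better, so in this toy table participant "b" would lead even though "a" has the single highest AUC.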

More details about submission and evaluation can be found on the platform [Get Started - Evaluation].

Terms & Conditions

Please find the challenge rules on the platform website [Get Started - Challenge Rules].


  • 1st Place: $2,000
  • 2nd Place: $1,500
  • 3rd Place: $500


Beijing Time (UTC+8)

  • Sep 24th, 2019, 23:59: Beginning of the feedback Phase, the release of practice datasets. Participants can start submitting code and obtaining immediate feedback on the leaderboard.

  • Oct 15th, 2019, 23:59: Real Personal Identification
  • Oct 22nd, 2019, 23:59: End of the feedback Phase.

  • Oct 23rd, 2019, 00:00: Beginning of the check Phase.
  • Oct 26th, 2019, 19:59: End of the check Phase.

  • Oct 26th, 2019, 20:00: Beginning of the final Phase.
  • Oct 28th, 2019, 20:00: Re-submission deadline.

  • Oct 30th, 2019, 20:00: End of the final Phase.

Note that the CodaLab platform uses the UTC time format. Please pay attention to the time descriptions elsewhere on this page so as not to mistake the time points for each phase of the competition.
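
As a quick aid for the conversion, the sketch below turns two of the Beijing Time deadlines listed above into UTC using the Python standard library. The helper name `beijing_to_utc` is hypothetical; the fixed UTC+8 offset is taken from the timeline heading.

```python
from datetime import datetime, timedelta, timezone

BEIJING = timezone(timedelta(hours=8))  # Beijing Time is UTC+8, with no daylight saving


def beijing_to_utc(year, month, day, hour, minute):
    """Interpret the given wall-clock time as Beijing Time and convert it to UTC."""
    local = datetime(year, month, day, hour, minute, tzinfo=BEIJING)
    return local.astimezone(timezone.utc)


feedback_start = beijing_to_utc(2019, 9, 24, 23, 59)
check_start = beijing_to_utc(2019, 10, 23, 0, 0)

print(feedback_start.isoformat())  # 2019-09-24T15:59:00+00:00
print(check_start.isoformat())     # 2019-10-22T16:00:00+00:00
```

Note that the second deadline falls on the previous calendar day in UTC.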





- Wei-Wei Tu, 4Paradigm Inc., China, (Coordinator, Platform Administrator, Data Provider, Baseline Provider, Sponsor)

- Isabelle Guyon, Université Paris-Saclay, France, ChaLearn, USA, (Advisor, Platform Administrator)

- Qiang Yang, Hong Kong University of Science and Technology, Hong Kong, China, (Advisor, Sponsor)

Committee (alphabetical order)

- Bo Han, RIKEN-AIP, Japan, (Admin)

- Hai Wang, 4Paradigm Inc., China, (Dataset provider, baseline)

- Ling Yue, 4Paradigm Inc., China, (Admin)

- Quanming Yao, 4Paradigm Inc., China, (Admin)

- Shouxiang Liu, 4Paradigm Inc., China, (Admin)

- Xiawei Guo, 4Paradigm Inc., China, (Admin)

- Zhengying Liu, U. Paris-Saclay; U. PSud, France, (Platform Provider)

- Zhen Xu, 4Paradigm Inc., China, (Admin)

Organization Institutes




Previous AutoML Challenges:

- First AutoML Challenge






- AutoCV2@ECML PKDD 2019



About 4Paradigm Inc.

Founded in early 2015, 4Paradigm is one of the world's leading AI technology and service providers for industrial applications. 4Paradigm's flagship product, the AI Prophet, is an AI development platform that enables enterprises to effortlessly build their own AI applications and thereby significantly increase the efficiency of their operations. Using the AI Prophet, a company can develop a data-driven "AI Core System", which could largely be regarded as a second core system next to the traditional transaction-oriented Core Banking System (IBM Mainframe) often found in banks. Beyond this, 4Paradigm has also successfully developed more than 100 AI solutions for use in various settings such as finance, telecommunication and internet applications. These solutions include, but are not limited to, smart pricing, real-time anti-fraud systems, precision marketing, personalized recommendation and more. And while it is clear that 4Paradigm can set up a completely new paradigm for how an organization uses its data, its scope of services does not stop there. 4Paradigm combines state-of-the-art machine learning technologies and practical experience, and brings together a team of experts ranging from scientists to architects. This team has successfully built China's largest machine learning system and the world's first commercial deep learning system. With its core team pioneering research on "Transfer Learning", 4Paradigm takes the lead in this area and, as a result, has drawn great attention from tech giants worldwide.

About ChaLearn

ChaLearn is a non-profit organization with vast experience in the organization of academic challenges. ChaLearn is interested in all aspects of challenge organization, including data gathering procedures, evaluation protocols, novel challenge scenarios (e.g., competitions), training for challenge organizers, challenge analytics, result dissemination and, ultimately, advancing the state-of-the-art through challenges.


About RIKEN

RIKEN is a large scientific research institute in Japan. Founded in 1917, it now has about 3,000 scientists on seven campuses across Japan, including the main site at Wakō, Saitama Prefecture, just outside Tokyo. Riken is a Designated National Research and Development Institute, and was formerly an Independent Administrative Institution. "Riken" is a contraction of the formal name Rikagaku Kenkyūjo, and its full name in Japanese is Kokuritsu Kenkyū Kaihatsu Hōjin Rikagaku Kenkyūsho and in English is the National Institute of Physical and Chemical Research.

About Microsoft

Microsoft Corporation is an American multinational technology company with headquarters in Redmond, Washington. It develops, manufactures, licenses, supports, and sells computer software, consumer electronics, personal computers, and related services. Its best known software products are the Microsoft Windows line of operating systems, the Microsoft Office suite, and the Internet Explorer and Edge Web browsers. Its flagship hardware products are the Xbox video game consoles and the Microsoft Surface lineup of touchscreen personal computers. As of 2016, it is the world's largest software maker by revenue, and one of the world's most valuable companies. The word "Microsoft" is a portmanteau of "microcomputer" and "software". Microsoft is ranked No. 30 in the 2018 Fortune 500 rankings of the largest United States corporations by total revenue. (Credits: Wikipedia)