Anaconda Slithers Into the Data Science Certification Space
On April 9, Austin, Texas-based Anaconda announced the Anaconda Data Science Certification. The company claims to offer the "most popular Python data science platform," so this new program could generate a lot of interest.
Nobody can dispute that Big Data and data science are hot growth areas, and that demonstrations of technical proficiency provide a great way to gain entry into this high-demand, high-activity portion of the IT jobs market.
Source: Anaconda Data Science Certification
The first credential to be offered is the Anaconda Certified Professional (ACP) - Data Scientist. To earn this certification, candidates must work through seven content modules, pass a unit exam for each one, then pass a comprehensive exam across all topics covered in that curriculum.
At present, the program is available through DataCamp, which is offering the curriculum in partnership with Anaconda. (Total Program Cost: $2,850.) Here's a summary of the currently available offerings, with prices, quoted (mostly) verbatim from the DataCamp and Anaconda websites:
Using the Anaconda Python distribution along with the Anaconda Project resource bundler and Anaconda Cloud provides a jump start on creating and sharing reproducible tools, libraries and more. Candidates will organize a local file-tree with command line tools to enable convenient data access for analysis; create, query and manage conda environments; locate and install conda packages into environments; bundle conda packages (pure Python/R); and distribute a complete data analysis with all assets (notebooks, data, environments, etc.) using Anaconda Cloud.
Data Import & Export
Data can be stored many ways, such as in spreadsheets, databases and files, and can live in different places, such as on a local machine or in "the cloud". This creates a challenge for Data Scientists, as all this data needs to be aggregated in order to be analyzed. This module will test your ability to access, aggregate, and manage data both on a local machine and on a remote storage resource.
$200
Data Manipulation and Analysis
The vast majority of a Data Scientist's day is spent cleaning and transforming data, otherwise known as manipulating, so that it can be analyzed properly. Data manipulation requires skill and knowledge, including understanding different data types, transforming unstructured data, and reshaping datasets. This module assesses your ability to clean, transform, summarize, and filter data to prepare it for analysis using the popular Pandas package.
$200
Data Visualization
One of the most important, but also most overlooked, skills in Data Science is data visualization. Good data visualizations help you effectively and efficiently communicate the insights gained through modeling and statistical analyses to your target audience. This module will test your ability to interpret visualizations through exploratory data analysis, determine the appropriate visualization for certain data types, build static and interactive visualizations, and turn a data visualization pipeline into a reusable function.
$200
Statistical Analysis and Inference
Extracting insights from a dataset requires accurate analysis and meaningful interpretation of the data. The most important insights are often derived from statistical analysis and inference. This module will test your ability to: formulate a business problem into a statistical analysis through identifying random variables and factors; perform A/B testing from the construction of a hypothesis through testing and evaluation; and identify various sources of error in the data.
$350
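For readers curious about what the A/B testing portion of such a module might involve, here is a minimal sketch of a two-proportion z-test using only the Python standard library. It is an illustration, not material from the Anaconda or DataCamp curriculum; the function name and the sample conversion counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    conv_a, conv_b: number of conversions in variants A and B
    n_a, n_b:       number of visitors in variants A and B
    Returns (z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Under the null hypothesis both variants share one conversion rate,
    # so the standard error uses the pooled proportion.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 200/1000 conversions for A, 250/1000 for B.
z, p = two_proportion_z_test(200, 1000, 250, 1000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value suggests the difference between variants is unlikely to be due to chance alone, though a real analysis would also need to settle the sample size and significance level before running the experiment.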
Machine Learning
Machine Learning is one of the most in-demand skills in the Data Science industry. Machine Learning involves teaching computers how to "learn" to make fast predictions or generalizations on large datasets. This module will evaluate your knowledge of supervised (classification and regression) and unsupervised (clustering) modeling techniques including pre-processing steps, hyperparameter tuning, and assessing model accuracy.
$350
Data Science at Scale
Data is becoming bigger and bigger, often too big to fit on a single machine, so that it may require multiple servers. Storing, retrieving, and analyzing "big data" requires specialized workflows and techniques. This module will test your ability to work with out-of-core datasets and streaming data, identify and work with distributed and parallel computing, and deploy predictive models with big data.
$350
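To give a feel for what "out-of-core" processing means in practice, the following sketch computes a mean over a stream of values one chunk at a time, so the full dataset never has to sit in memory. It is an illustration under simple assumptions, not part of the Anaconda curriculum; the helper names and the generated data are hypothetical, and a real workflow would more likely read chunks from disk or a network source.

```python
from typing import Iterable, Iterator, List


def read_in_chunks(values: Iterable[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield successive lists of at most chunk_size values from an iterable."""
    chunk: List[float] = []
    for value in values:
        chunk.append(value)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, chunk.
        yield chunk


def streaming_mean(chunks: Iterable[List[float]]) -> float:
    """Compute the mean while holding only one chunk in memory at a time."""
    count = 0
    total = 0.0
    for chunk in chunks:
        count += len(chunk)
        total += sum(chunk)
    if count == 0:
        raise ValueError("cannot compute the mean of an empty stream")
    return total / count


# A generator stands in for a dataset too large to load at once.
data = (float(i) for i in range(1, 1_000_001))
mean = streaming_mean(read_in_chunks(data, 10_000))
print(mean)
```

Only the running count and total persist between chunks, which is the same basic idea that chunked readers and distributed frameworks build on.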
Becoming a Data Scientist requires the skills to work with data from start to finish. A good Data Scientist knows how to import, clean, explore, visualize, and analyze the data and present findings in a clear, compelling manner. This project-based exam covers all the steps in the Data Science Certification path. Passing the comprehensive exam earns candidates the official title of Anaconda Certified Professional (ACP) – Data Scientist.
The company also says in a blog post about the program that it plans to offer "other learning paths and corresponding certifications soon." It should be interesting to see what else comes from this organization.
In the meantime, the current ACP – Data Scientist looks interesting, reasonably comprehensive, and neither dirt-cheap nor overwhelmingly expensive to acquire. At just under $3,000 to get through the training, the exams, and the project, Anaconda's offering falls in the middle of the pack.
Microsoft's Data Science MPP (Microsoft Professional Program) includes 10 courses, plus a capstone project, for $990. Other programs can easily cost more than $5,000 to complete, including both exams and training. How Anaconda's offerings fare in the marketplace remains to be seen.
For that we'll have to wait a while, and then see what's up. Stay tuned!