Guess the best job in the US for the past three years running. It’s Data Science, according to Glassdoor, which combines three factors to work out its top-job rankings: number of job openings, salary and overall job satisfaction rating. The Harvard Business Review calls Data Science the ‘sexiest job of the 21st century’. So how can you become a Data Scientist in 2018 and what can you expect to earn in this desirable job?
A Glassdoor study states that the median salary is $110 000 per annum in the US – which means that half of Data Scientists earn more than this! The national average is $120 931. In 2018, the demand for Data Science talent in the US is projected to be 50-60% greater than the supply. This translates into a shortage of 140 000 to 190 000 people as well as 1.5 million managers and analysts, states Medium.
This is confirmed by LinkedIn’s 2017 US Emerging Jobs Report, which stated that the number of Data Scientist roles has grown by over 650% since 2012. Hundreds of companies are hiring for these roles, as well as for related roles such as Analytics Manager, Database Administrator, Data Engineer, Data Analyst and Business Intelligence Developer, all of which appear in Glassdoor’s top 50 best jobs. Yet only 35 000 people in the US have Data Science skills. How then can you become a Data Scientist in 2018? Firstly, let’s see what Data Scientists actually do.
What do data scientists do?
At its simplest, a Data Scientist analyses data for actionable insights on anything from product development to customer retention to new business opportunities. Here’s how the process works, according to a piece on Medium:
- Frame the problem: You need to work out how to translate the client’s issue into a concrete and well-defined problem. A potential problem could be, ‘How do I work out which prospective customers are likely to buy a particular product?’
- Collect the raw data to solve the problem: You will need to work out what resources (e.g. time, money) you need for this, and what data points are actually worth collecting to solve the problem.
- Process the data: Raw data frequently needs to be cleaned of problems such as errors, missing values and corrupted records.
- Explore the data: What are the patterns, trends or correlations?
- Conduct in-depth analysis: This is probably the main part of your project, where you apply machine learning, statistical models and algorithms to come up with insights and predictions, for example by building a predictive model.
- Communicate the results: Here’s where you tell the story of your data, known as data storytelling, to the various stakeholders. By this point you should have an accurate machine learning model that can predict the answer to the initial problem.
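To make the steps above a little more concrete, here is a minimal Python sketch that uses only the standard library. Everything in it is made up for illustration: the customer records, the field names `visits` and `bought`, and the simple threshold rule standing in for a real model. An actual project would more likely rely on libraries such as pandas and scikit-learn, and on far more data.

```python
from statistics import mean

# Hypothetical raw data: site visits per prospect and whether they bought.
raw = [
    {"visits": 1, "bought": 0},
    {"visits": 2, "bought": 0},
    {"visits": None, "bought": 1},  # missing value
    {"visits": 3, "bought": 0},
    {"visits": 7, "bought": 1},
    {"visits": 8, "bought": 1},
    {"visits": 9, "bought": 1},
    {"visits": -4, "bought": 0},    # impossible value (an error)
]

def clean(rows):
    """Process: drop rows with missing or impossible visit counts."""
    return [r for r in rows if r["visits"] is not None and r["visits"] >= 0]

def explore(rows):
    """Explore: compare the average visits of buyers and non-buyers."""
    buyers = [r["visits"] for r in rows if r["bought"] == 1]
    others = [r["visits"] for r in rows if r["bought"] == 0]
    return mean(buyers), mean(others)

def fit_threshold(rows):
    """Analyse: a toy 'model' that splits halfway between the two averages."""
    buyer_avg, other_avg = explore(rows)
    return (buyer_avg + other_avg) / 2

def predict(threshold, visits):
    """Predict 1 (likely to buy) when visits exceed the fitted threshold."""
    return 1 if visits > threshold else 0

data = clean(raw)
threshold = fit_threshold(data)

# Communicate: report how often the toy model matches the known outcomes.
correct = sum(1 for r in data if predict(threshold, r["visits"]) == r["bought"])
accuracy = correct / len(data)
print(f"threshold={threshold}, accuracy={accuracy}")
```

The threshold rule is deliberately simplistic. It is there to show the shape of the workflow, not a technique you would use on real customers.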
Think about how valuable these skills are for companies, which now have the capacity to collect vast amounts of data. Says Sarah Mbaka, Mentor at HyperionDev: ‘Data science is exciting, because it unites all fields. I can work in the field of medicine, astronomy or physics. So long as the field has data, they need a data scientist.’ Individuals who can organise and analyse this data for business insights are naturally in high demand.
How do I become a data scientist?
To become a Data Scientist, you’ll want to start learning Python, R and SQL, the main analytical languages for Data Science. On Glassdoor, 9 out of every 10 job postings ask for knowledge of one of these languages. Other popular coding languages for Data Science include Java, Perl and C/C++. You’ll also learn various Data Science libraries, including Matplotlib, scikit-learn, NumPy and NLTK. In addition, you’ll cover Machine Learning, which gives computers the ability to ‘learn’ via statistical techniques and is based on pattern recognition. One example of machine learning is the recommendations offered by Netflix. You’ll also cover various data visualisation tools and how to work with unstructured data, whether it’s from social media, video feeds or audio.
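To give a feel for the pattern recognition behind recommendations, here is a small Python sketch that suggests shows based on the viewer with the most similar tastes. It is not how Netflix actually works, as real systems are far more sophisticated. The viewers, show titles and ratings are all invented, and the similarity measure (cosine similarity over shows both viewers have rated) is just one simple choice.

```python
from math import sqrt

# Hypothetical ratings (1-5) given by viewers to shows; all names are invented.
ratings = {
    "ana":  {"Space Saga": 5, "Baking Duel": 1, "Crime Files": 4},
    "ben":  {"Space Saga": 4, "Baking Duel": 2, "Crime Files": 5, "Robot Wars": 5},
    "cara": {"Space Saga": 1, "Baking Duel": 5, "Garden Life": 4},
}

def cosine_similarity(a, b):
    """Similarity of two viewers, based only on shows both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[s] * b[s] for s in shared)
    norm_a = sqrt(sum(a[s] ** 2 for s in shared))
    norm_b = sqrt(sum(b[s] ** 2 for s in shared))
    return dot / (norm_a * norm_b)

def recommend(user):
    """Suggest shows the user has not rated, taken from the most similar viewer."""
    others = [u for u in ratings if u != user]
    nearest = max(others, key=lambda u: cosine_similarity(ratings[user], ratings[u]))
    unseen = set(ratings[nearest]) - set(ratings[user])
    # Highest-rated suggestions first.
    return sorted(unseen, key=lambda s: ratings[nearest][s], reverse=True)

print(recommend("ana"))
```

Here `ana` rates shows much like `ben` does, so the sketch suggests the one show `ben` has rated that `ana` has not.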
To transition into this career, there are three possible routes:
- A Master’s programme – the university setting is time-consuming and expensive but also comprehensive.
- MOOCs (read about the top 20 Data Science MOOCs) – this self-taught approach tends to be scrappy and unguided. When you get stuck, you can get truly lost! On the plus side, it’s inexpensive or even free. (Read the top five questions to answer before deciding you’re up for learning to code on your own.)
- Bootcamps – these are taught by experts in the field and rely on real-life projects. They can be flexible and online. Some bootcamps offer supportive one-on-one mentorship.