A data-driven analysis of the data science job market
Data science is one of the hottest careers of this decade. A data science professional draws on a strong knowledge of statistics and programming to develop analyses that extract value from diverse and, more often than not, unstructured data sets. Over the past few years many companies have realized the potential of data-driven decision making and have responded by making huge strides towards collecting as much data as possible. This glut of data has resulted in strong demand for data scientists in the job market today. Moreover, this trend is expected to carry on well into the future, making data science one of the most attractive professions for many job seekers. Here at VitaeKale, we love every job and opportunity alike. However, even we are not immune to this shiny new opportunity, which has created such a buzz in the job market over the past few years. We will be coming up with a series of data- and research-driven articles (duh!) giving our readers critical insights on how to break into the world of data.
In this article we will talk about some of the key skills and keywords we found after analyzing quite a few job postings that contained the keyword “data science”. We also looked into the profiles of a number of data science professionals to figure out what their key skills and attributes are. Using this information, we can get a fair idea of the most important skills that you need to either highlight or learn to add gravitas to your data science application. This part of the research is known as secondary knowledge collection: we do not interview people but scour multiple information sources to generate key insights and usable information.
Data science is a young profession, relatively speaking. From our analysis we found that the average length of experience for a data scientist was 6 years. This figure includes professional tenures in other quantitative fields. 60.1% of the data science professionals we analyzed had a master’s degree and 27.9% had a PhD. At the bachelor’s level we found appreciable diversity, albeit within the realm of STEM fields. As expected, a significant portion of data scientists (39.2%) had engineering degrees. A sizable portion had degrees in mathematics and statistics (22.7%). Around 12.8% of the professionals had a degree in the natural sciences, leaning strongly towards Physics, and 12.4% had degrees in economics and finance. Other fields made up 13.9%. A breakdown of engineering degrees revealed a propensity towards Computer Science (36.84%), although the distribution was not extremely skewed towards it. Electronics, Chemical and Mechanical engineers also made up a large share of the engineering professionals in data science. One of the interesting takeaways from this part of the analysis is that a degree in Computer Science, while certainly helpful, is not imperative, at least at the bachelor’s level. This is certainly good news for aspiring data scientists who are concerned that their non-Computer Science degree might be an impediment.
Moving on to graduate degrees, we found that engineering degrees had a 21.4% share. Degrees in data science and analytics constituted 32.3% of the total. 16.9% of the data science professionals had degrees in mathematics and statistics, while a smaller fraction had degrees in the natural sciences. The analysis suggests that a large fraction of professionals specialized in data science / analytics after their bachelor’s degrees.
Based on the available data, we can safely conclude that degrees in computer science or statistics, while certainly helpful, are not an absolute prerequisite for a career in data science. We have also seen that an overwhelming majority of data scientists have STEM degrees. This is to be expected because of the highly quantitative nature of the profession. A fairly high percentage of data scientists had a master’s degree, with a large pool of professionals going to graduate school for analytics / data science programs. If you are a data science aspirant, investing in an advanced degree will definitely help you garner stronger interest from recruiters.
The diversity in academic qualifications also suggests that recruiters do not filter out resumes based on degrees or schools attended. They look more closely into your profile than they do for a myriad of other technical professions. This can be attributed to the strong demand for data scientists in the marketplace today. While this is very encouraging, it also reinforces how critical your resume and LinkedIn profile are. These two tools are a recruiter’s primary source of information about your candidature. It is vital that you list all your relevant skills and projects in a clear and concise manner. A well-structured resume can definitely give you an edge over a large pool of candidates who may have skills and experience similar to yours. As we discussed in our previous article (Let’s Talk Interviews) on being quantitative, you have to show through your resume how your data science efforts were structured and how they brought value to your organization. Real-life data science skills are the key. If your profile can convince a recruiter you possess those, you will be a step closer to landing your dream job.
The first part of the report gave you an overview of professionals in the current data science landscape. Next, we get to our findings from published data science roles and some of the key skills listed in the profiles of data scientists we studied. This will be a useful guide for a data science aspirant on how to structure their resume and assess any skill gaps that might prevent their profile from getting traction.
Through our analysis we came up with a list of skills and attributes which professionals in data science deemed important enough to include in their professional networking profiles. Python topped the list, with 56% of the analyzed candidates’ profiles including this skill. SQL and R Studio ranked second and third respectively. It is not surprising that R Studio ranked behind Python. The data science community has seen a fairly strong surge in the popularity of Python over the past few years. Until recently R was the language of data science; however, Python, owing to its versatility, has gained immense traction. A lot of production-level code in major companies like Netflix is written in Jupyter Notebooks. Other programming languages constituted a smaller share, with MATLAB at 34.5%, SAS at 28.1%, C++ at 24.1%, Java at 22.0% and C at 22.0%. While not as prevalent as Python, a knowledge of other programming languages is fairly common among data scientists. In terms of other skills, as expected, most data scientists listed attributes like data science, ML and statistics very frequently. This is hardly surprising given the job description. The data visualization tool Tableau was also listed by 24.1% of the professionals. Since visualization is a huge part of data science efforts, a knowledge of data visualization tools is certainly a must-have. Non-technical skills like project management and leadership also found their way into the top 20 list. These skills are certainly sought after in a corporate setting. Trusty old MS Office was also listed by a fairly sizable group of professionals (35.4%). A working knowledge of MS Excel, PowerPoint and Word is very standard.
Next, we used the list of skills gathered from our analysis of data scientist profiles and measured how frequently they appeared in data scientist job postings. Shown are the top 20 skills from job descriptions, based on the top 100 attributes from professionals’ profiles. We first looked into programming languages. The analysis revealed that Python was still the most critical skill, found in 65.9% of job postings. On the other hand, R Studio was listed in only 43.0% of the job postings, which reinforces our previous assertion that Python is replacing R Studio as the preferred data science programming tool. SQL, though not as frequently mentioned (37.6%) in data science job postings, is still an important technical skill from a data science perspective. Java and SAS were also found to be important programming languages, appearing in 25.0% and 22.3% of the job postings respectively. Education-wise, we found that 90.1% of the job postings in data science asked for at least a bachelor’s degree. We also found that 59.6% of the jobs posted either required or preferred a master’s degree. Interestingly enough, this number is very close to the percentage of data science professionals having a master’s degree (60.1%) as found from the analysis of their profiles. This appears to be a typical case of supply attuned to demand. While it could be coincidental, it is very likely that candidates have assessed the requirements of the data science labor market (possibly through peer networks, college career counselors etc.) and are accordingly investing time in graduate degrees to enhance their prospects. Moving on to skills other than programming languages, as expected, ML skills and statistics were a top job requirement. Deep learning showed up in 21.7% of job postings; interestingly, however, only 14.2% of data scientists had listed it as a skill in their profile, and it did not make it into the top 20 skills in the previous plot.
Similarly, big data analytics appeared in 20.4% of the job postings while only 6.7% of the data science professionals had listed it in their profiles.
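To make the posting percentages above concrete, here is a minimal Python sketch of how the share of postings mentioning each skill could be measured. It is an illustration only, not the exact procedure used for this article: the function name `skill_share`, the whole-token matching rule and the sample postings are all assumptions.

```python
import re


def skill_share(postings, skills):
    """Return {skill: percentage of postings that mention it at least once}."""
    if not postings:
        return {}
    shares = {}
    for skill in skills:
        # Assumed matching rule: case-insensitive, whole token only.
        # Lookarounds are used instead of \b so that names ending in
        # symbols, such as "C++", still match correctly.
        pattern = re.compile(
            r"(?<!\w)" + re.escape(skill) + r"(?!\w)", re.IGNORECASE
        )
        hits = sum(1 for text in postings if pattern.search(text))
        shares[skill] = round(100.0 * hits / len(postings), 1)
    return shares


# Hypothetical postings, invented purely for illustration.
sample_postings = [
    "Python and SQL required",
    "Experience with R and Python",
    "Java, C++ developer",
    "sql analyst",
]

print(skill_share(sample_postings, ["Python", "SQL", "R", "C++"]))
```

Short names need care with this kind of matching. Under the rule above, a search for "C" would also count postings that only mention "C++", so a real analysis would need extra handling for such cases.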
This can be interpreted in one of two ways. First, we can assume that some data science professionals have not listed said skills on their professional networking profiles. The major takeaway here is that these skills are in demand. If employers are explicitly listing these skills in job descriptions, then recruiters are looking for professionals with the expertise. If you are indeed proficient in these skills, it is very important that they are listed in your resume and your professional networking profile. It will help catch the attention of a recruiter. The second scenario is that the difference represents a skills gap between what the data science market requires and what is currently available. If that is the case, then it can present a huge opportunity for a data science aspirant. Any candidate who possesses these high-demand, low-availability skill sets will have a significant advantage in the job market. To that effect, we created a list of skills / attributes with the highest skill gaps. We found that business skills like communication and management had the highest skill gaps, at 37.7% and 27.6% respectively. As mentioned above, it is possible that data scientists neglect to add these skills to their profiles; however, such skills are important in a business environment and can certainly help get your profile noticed. Further, we found that a knowledge of statistics was short by 10.4%. A strong knowledge of statistics is fairly imperative in data science and, understandably, is in short supply. TensorFlow, the Python package used for deep learning, was also found to be more common in job requirements than in the skills listed by professionals, by 10.4%. Along with deep learning itself (7.5% shortage), it suggests that there is a need for deep learning skills in data science today. For aspiring data science professionals this can present a great opportunity to develop these skills and effectively market them to recruiters.
We also found that a knowledge of Python was more in demand than in supply (10.0% shortage), which again can be a great opportunity for aspiring data scientists. This shortage is not very surprising, however, as we saw that the percentage of professionals with proficiency in R, C++ and MATLAB was somewhat higher than the demand. We can make a cautious inference that while a lot of companies have moved towards Python, many data scientists have been slower to adopt it because of their proficiency in other programming languages and the Python learning curve. Aspiring data scientists with Python skills may therefore find it easier to break into the market, as demand for the skill appears to exceed supply.
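The skill gaps discussed above can be reproduced with a few lines of Python. This sketch assumes a gap is simply the percentage of job postings listing a skill minus the percentage of profiles listing it. The article does not spell out the formula, but the deep learning figures are consistent with it (21.7% - 14.2% = 7.5%). The function name `skill_gaps` is hypothetical; the input numbers are the ones quoted in this article.

```python
def skill_gaps(demand, supply):
    """Return {skill: demand % - supply %}, largest gap first.

    Assumption: the gap is a plain difference in percentage points.
    Skills missing from either input are skipped.
    """
    gaps = {
        skill: round(demand[skill] - supply[skill], 1)
        for skill in demand
        if skill in supply
    }
    return dict(sorted(gaps.items(), key=lambda item: item[1], reverse=True))


# Percentages quoted in this article.
postings = {"deep learning": 21.7, "big data analytics": 20.4}
profiles = {"deep learning": 14.2, "big data analytics": 6.7}

for skill, gap in skill_gaps(postings, profiles).items():
    print(f"{skill}: {gap:+.1f} percentage points")
```

A positive value means the skill shows up more often in job postings than in profiles, which is the pattern worth watching as a job seeker.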
Data science is one of the most sought-after professions in the job market today. This article is the first of a series of reports that will tackle some of the key questions every aspiring data scientist has. Further down the line we will be including primary research, where we talk to employers and professionals in this field and bring you first-hand information on the state of the data science job market. We are also working to expand our research into the major employers in this industry and gather usable information on critical sector- and employer-specific skills. Help us do this better by getting in touch with your comments and critique. Let us know how we can create more value for you.