A reminder from Huang Ge Python: who gave you the courage to think you could work in big data?
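My apologies, friend: I am publishing the question you sent me privately so that you, and many others, can form an accurate picture of big data.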
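How this post came about: a reader who follows me asked in a private message, "Hello Huang Ge! I'd like to ask: I'm 26, starting from zero, with a junior high school education, and I want to join a training class to learn programming. Is it better to study Java big data or PHP?"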
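Here is my reply: "A junior high school education, and big data on top of that? This is not discrimination; you simply would not be able to do the work. Unless you have a very strong interest, I suggest doing something else, because the education level is too low. A training class will talk you out of tens of thousands of yuan, and in the end you will not find a job. Who gave you the nerve to sign up for a training class? Try teaching yourself first and see how it goes."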
1. Definitions of big data, data science, and data analysis
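Big data refers to high-volume, high-growth, and diverse information assets that require new processing models in order to deliver stronger decision-making power, insight, and process optimization. The concept of "big data" was first put forward by Viktor Mayer-Schönberger and Kenneth Cukier in their book Big Data: A Revolution That Will Transform How We Live, Work, and Think (Chinese edition: 《大數據時代》), where it means analyzing and processing all of the data rather than taking the shortcut of random analysis (sample surveys). Big data has four "V" characteristics: Volume, Velocity, Variety, and Value. Source: Sogou Baike.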
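Data science is a discipline that uses data to gain knowledge; its goal is to produce data products by extracting the valuable parts of the data[1]. It combines theory and techniques from many fields, including applied mathematics, statistics, pattern recognition, machine learning, data visualization, data warehousing, and high-performance computing. Data science draws on a variety of relevant data to help non-specialists understand problems. Its techniques help us process data correctly and support research in fields such as biology, the social sciences, and anthropology. Data science is also a great help in business competition[2].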
Data analysis
Data analysis, also known as analysis of data or data analytics, is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.
Data mining is a particular data analysis technique that focuses on modeling and knowledge discovery for predictive rather than purely descriptive purposes, while business intelligence covers data analysis that relies heavily on aggregation, focusing on business information.[1] In statistical applications data analysis can be divided into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA). EDA focuses on discovering new features in the data and CDA on confirming or falsifying existing hypotheses. Predictive analytics focuses on application of statistical models for predictive forecasting or classification, while text analytics applies statistical, linguistic, and structural techniques to extract and classify information from textual sources, a species of unstructured data. All are varieties of data analysis.
Data integration is a precursor to data analysis, and data analysis is closely linked to data visualization and data dissemination. The term data analysis is sometimes used as a synonym for data modeling.
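The definition above describes data analysis as inspecting, cleansing, transforming, and modeling data. The following is a minimal sketch of the first three steps plus a descriptive summary, using only the Python standard library. The sample records, the rule that negative sales are invalid, and all function names are assumptions made up for illustration; they do not come from the quoted definition.

```python
from statistics import mean, median

# Hypothetical raw records: (region, monthly_sales). None marks a missing value.
raw_records = [
    ("north", 120.0),
    ("south", None),
    ("north", 80.0),
    ("south", 200.0),
    ("east", -5.0),   # assumption: negative sales are treated as entry errors
    ("east", 100.0),
]


def inspect(records):
    """Inspecting: count how many records are missing or invalid."""
    missing = sum(1 for _, value in records if value is None)
    invalid = sum(1 for _, value in records if value is not None and value < 0)
    return {"total": len(records), "missing": missing, "invalid": invalid}


def cleanse(records):
    """Cleansing: drop records with missing or negative values."""
    return [(region, value) for region, value in records
            if value is not None and value >= 0]


def transform(records):
    """Transforming: group the cleaned values by region."""
    grouped = {}
    for region, value in records:
        grouped.setdefault(region, []).append(value)
    return grouped


def describe(grouped):
    """Descriptive statistics per region."""
    return {
        region: {"count": len(values), "mean": mean(values), "median": median(values)}
        for region, values in grouped.items()
    }


report = inspect(raw_records)
summary = describe(transform(cleanse(raw_records)))
print(report)
print(summary)
```

A real project would more likely use a library such as pandas for these steps; the standard library is used here only to keep the sketch self-contained.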
2. Essential skills
Reposted from: 5 Essential Skills Every Big Data Analyst Should Have - Jigsaw
Essential big data skill #1: Programming
Learning how to code is an essential skill in the Big Data analyst's arsenal. You need to code to conduct numerical and statistical analysis with massive data sets. Some of the languages you should invest time and money in learning are Python, R, Java, and C++ among others. The more you know, the better; just remember that you do not have to learn every single language out there.
As every IT professional can tell you, if you know one language well, you can easily pick up the rest. Hands-on experience with these languages and programming will help in your learning effort. Finally, being able to think like a programmer will help you become a good big data analyst.
Tip: If you're looking to start learning a programming language, start with Python.
Another important aspect of programming entails interacting with databases through queries and statements. Databases, instructional languages and big data tools should be a part of your repertoire. Tools such as R, HIVE, SQL, and Scala are something that you should be comfortable with.
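To make "interacting with databases through queries and statements" concrete, here is a small sketch that runs an aggregate SQL query from Python using the standard-library sqlite3 module. The `orders` table, its columns, the sample rows, and the `top_customers` helper are all hypothetical and exist only for this example.

```python
import sqlite3


def top_customers(conn, limit=2):
    """Return (customer, total) rows, largest total first."""
    cursor = conn.execute(
        """
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
        ORDER BY total DESC
        LIMIT ?
        """,
        (limit,),  # parameter binding instead of string formatting
    )
    return cursor.fetchall()


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("alice", 30.0),
        ("bob", 12.5),
        ("alice", 20.0),
        ("carol", 40.0),
        ("bob", 7.5),
    ],
)
conn.commit()

result = top_customers(conn)
print(result)
conn.close()
```

The same `GROUP BY` and `ORDER BY` pattern carries over to larger SQL engines, though dialect details differ from SQLite.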
Essential big data skill #2: Quantitative Skills
As a big data analyst, programming helps you do what you need to do. But, what are you supposed to do?
The quantitative skills you need to be a good big data analyst answer this question. For starters, you need to know multivariable calculus and linear and matrix algebra. You will also need to know probability and statistics.
By learning these skills, you will have a strong foundation in numerical analysis.
Numerical and statistical analysis are core quantitative skills that every good big data analyst needs. This knowledge enables the use of concepts such as neural networks and machine learning.
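As one small illustration of why these quantitative foundations matter, the sketch below fits a straight line by ordinary least squares. This combines statistics (means and deviations) with the calculus result that the squared error is minimized at slope = Sxy / Sxx. The `fit_line` function and the sample data are assumptions chosen for this example, not something from the reposted article.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With several input variables the same idea is written in matrix form, which is where the linear and matrix algebra mentioned above comes in.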
Essential big data skill #3: Multiple Technologies
Programming is an essential big data analysis skill. What makes it extra special, though, is the versatility. You can, and must, learn multiple technologies that will help you grow as a Big Data analyst.
But, technologies are not limited to programming alone. The range of technologies that a good big data analyst must be familiar with is huge. It spans myriad tools, platforms, hardware and software. For example, Microsoft Excel, SQL and R are basic tools. At the enterprise level, SPSS, Cognos, SAS, MATLAB are important to learn as are Python, Scala, Linux, Hadoop and HIVE.
The actual technologies that you use will depend upon the environment you are working in. It will also vary based on the requirements of your company and project.
The more technologies you are familiar with, the more versatile you will be.
Essential big data skill #4: Understanding of Business & Outcomes
Analysis of data and insights would be useless if it cannot be applied to a business setting. All big data analysts need to have a strong understanding of the business and domain they operate in.
Domain expertise can magnify the impact of the big data analyst's insights.
Big data analysts can identify relevant opportunities and threats based on their business expertise. Consider the introduction of iPads. When they were introduced, the digital publishing industry was all set for disruption. But, outsiders could not realize the transformation that was possible. It took industry expertise and connections to usher in the era of digital publishing.
Domain expertise enables big data analysts to communicate effectively with different stakeholders. Consider recommending that new employees be added to a factory floor. When pitching it to the CFO it could be positioned as a net increase in top line margins. It may need to be repositioned as a reduction in quality test failures to the operations head. Domain expertise makes these conversations easier and more effective.
Essential big data skill #5: Interpretation of Data
Of all the skills we have outlined, interpretation of data is the outlier. It is the one skill that combines both art and science. It requires the precision and sterility of hard science and mathematics but also calls for creativity, ingenuity, and curiosity.
In most companies, a large majority of employees don't understand their own company's data. In fact, most employees do not even have a clear idea of where all the data is. These employees often rely on preconfigured reports and dashboards to derive their insights. Unfortunately, this approach is dangerous. It does not provide a holistic view of the data procurement and analysis process. This problem is often compounded by the fragmentation of data systems. As companies grow inorganically, different data silos merge, resulting in a confusing mess.
However, by asking the right questions, a Big Data analyst can embark on a proper exploration of the raw data. The right questions and discoveries can change the course of business for an organization.
In Conclusion
Becoming a big data analyst requires the mastery of the five essential skills. IT professionals have an advantage in learning new programming languages and technologies. Others will need to put in more effort to learn computing skills and technologies. But, softer skills such as business experience and domain expertise level the playing ground.
3. Look at what companies require for big data positions.
See this article:
A warning to friends who blindly want to get into big data.
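Summary: compare your own knowledge with the requirements above and judge whether you can do big data work. I suggest starting out as a junior programmer. With only a junior high school education, I do not recommend switching careers to programming unless you are prepared to put in a great deal of effort, and how many people actually can? For many it is just a case of Lord Ye professing to love dragons (葉公好龍): fond of the idea, but not of the real thing.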