Data Science Handwritten Notes
What is Data?
Data are measurable units of information gathered or captured from the activity of people, places and things.
What is Data Science?
Data science encapsulates the interdisciplinary activities required to create data-centric artifacts and applications that address specific scientific, socio-political, business or other questions.
What is the importance of Data Science?
As the world becomes increasingly digital, organizations handle enormous volumes of structured and unstructured data every day. Evolving technologies have made it cheaper and easier to store critical data.
What are the Applications of Data Science?
Common applications of data science include:
- Fraud and Risk Detection
- Internet Search
- Targeted Advertising
- Website Recommendations
- Advanced Image Recognition
- Speech Recognition
- Airline Route Planning
- Augmented Reality
Topics in our Data Science Handwritten Lecture Notes PDF
The topics we will cover will be taken from the following list:
Introduction: Introduction to Data Science, Exploratory Data Analysis and Data Science Process. Motivation for using Python for Data Analysis, Introduction to the Python shell, IPython and Jupyter Notebook.
Essential Python Libraries: NumPy, pandas, matplotlib, SciPy, scikit-learn, statsmodels
Getting Started with pandas: Arrays and Vectorized Computation, Introduction to pandas Data Structures, Essential Functionality, Summarizing and Computing Descriptive Statistics. Data Loading, Storage and File Formats: Reading and Writing Data in Text Format, Web Scraping, Binary Data Formats, Interacting with Web APIs, Interacting with Databases. Data Cleaning and Preparation: Handling Missing Data, Data Transformation, String Manipulation.
Data Wrangling: Hierarchical Indexing, Combining and Merging Data Sets, Reshaping and Pivoting.
Data Visualization with matplotlib: Basics of matplotlib, Plotting with pandas and seaborn, Other Python Visualization Tools
Data Aggregation and Group Operations: GroupBy Mechanics, Data Aggregation, General split-apply-combine, Pivot Tables and Cross-Tabulation
Time Series Data Analysis: Date and Time Data Types and Tools, Time Series Basics, Date Ranges, Frequencies and Shifting, Time Zone Handling, Periods and Period Arithmetic, Resampling and Frequency Conversion, Moving Window Functions.
Advanced Pandas: Categorical Data, Advanced GroupBy Use, Techniques for Method Chaining
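
Several of the pandas topics listed above (handling missing data, split-apply-combine, pivot tables, resampling and moving window functions) can be sketched on one small example. The following is only a minimal illustration and is not taken from the notes themselves; the `sales` table, its column names and its values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records; all names and values are made up for illustration.
sales = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-03", "2024-01-03"]
        ),
        "region": ["north", "south", "north", "south", "north"],
        "units": [10.0, np.nan, 7.0, 4.0, 5.0],
    }
)

# Handling missing data: treat the missing unit count as 0 (an assumption;
# dropping the row or interpolating are other options).
clean = sales.fillna({"units": 0.0})

# Split-apply-combine: split by region, sum the units, combine into a Series.
units_by_region = clean.groupby("region")["units"].sum()

# Pivot table: regions as rows, dates as columns, summed units as values.
pivot = clean.pivot_table(
    index="region", columns="date", values="units", aggfunc="sum"
)

# Resampling: total units per calendar day.
daily = clean.set_index("date")["units"].resample("D").sum()

# Moving window function: 2-day moving average of the daily totals.
rolling = daily.rolling(window=2).mean()

print(units_by_region)
print(pivot)
print(daily)
print(rolling)
```

The first value of `rolling` is missing because a 2-day window has no complete history on the first day; passing `min_periods=1` would be one way to change that.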