Learn Data Science

Coming Soon

Analyze, visualize, and model real-world data

Topics: Data Science, Pandas, NumPy, Jupyter, Visualization

Turn raw data into actionable insights. Data science combines statistics, programming, and domain knowledge to extract meaning from datasets. The core Python toolkit is Pandas for manipulation, Matplotlib and Seaborn for visualization, and scikit-learn for modeling.

This course covers the full data science workflow from data cleaning through exploratory analysis to building predictive models.
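
As a rough illustration of that workflow, the sketch below cleans a small table, runs a quick exploratory check, and cross-validates a simple model. The dataset is synthetic and the column names (`size`, `price`) are assumptions made up for this example; a real project would load its own data and likely need more careful cleaning.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic data (an assumption for illustration): price grows linearly with size.
rng = np.random.default_rng(0)
size = rng.uniform(50, 150, 100)
price = 3.0 * size + 20.0 + rng.normal(0, 5, 100)
df = pd.DataFrame({"size": size, "price": price})
df.loc[[3, 17], "price"] = np.nan  # simulate two missing values

# 1. Clean: drop rows where the target is missing.
clean = df.dropna(subset=["price"])

# 2. Explore: summary statistics and the feature/target correlation.
summary = clean.describe()
correlation = clean["size"].corr(clean["price"])

# 3. Model: 5-fold cross-validated R^2 for a linear regression.
model = LinearRegression()
scores = cross_val_score(model, clean[["size"]], clean["price"], cv=5)

print(summary)
print(f"correlation: {correlation:.3f}")
print(f"mean R^2 across folds: {scores.mean():.3f}")
```

Dropping rows is only one way to handle missing values; imputation is often a better fit when data is scarce.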

Start Here — Learning Roadmap

A suggested path from zero to mastery. Follow these steps in order:

  1. Learn Python basics — Variables, loops, functions, and data structures (lists, dicts) are prerequisites for data science
  2. Set up Jupyter notebooks — Install Anaconda or JupyterLab to get an interactive environment for data exploration
  3. Master NumPy arrays — Understand array operations, broadcasting, vectorization, and why NumPy is faster than Python loops
  4. Wrangle data with Pandas — Load CSVs, filter rows, group and aggregate, handle missing values, and merge DataFrames
  5. Visualize with Matplotlib and Seaborn — Create line charts, bar plots, histograms, heatmaps, and scatter plots to communicate findings
  6. Learn statistics fundamentals — Understand distributions, hypothesis testing, p-values, correlation, and confidence intervals
  7. Build ML models with scikit-learn — Train linear regression, decision tree, and random forest models, and evaluate them with cross-validation
  8. Handle real-world data problems — Deal with missing data, outliers, imbalanced classes, feature engineering, and data leakage
  9. Communicate results — Create compelling data stories with clear visualizations, dashboards, and written analysis
  10. Scale to production — Use Polars for large datasets and SQL for database queries, and learn MLOps basics for model deployment
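
Steps 3 and 4 of the roadmap can be sketched in a few lines. The example below is a minimal illustration, not a complete tutorial: the arrays, the `orders` and `customers` tables, and their column names are all invented for this sketch.

```python
import numpy as np
import pandas as pd

# Step 3: vectorization and broadcasting with NumPy.
prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([1, 2, 3])

# Pure-Python loop, shown only for comparison.
loop_totals = [p * q for p, q in zip(prices, quantities)]

# Vectorized: one elementwise multiply that runs in compiled code.
vector_totals = prices * quantities

# Broadcasting: a (3, 1) column combined with a (2,) row gives a (3, 2) grid.
tax_rates = np.array([0.0, 0.1])
with_tax = prices[:, np.newaxis] * (1 + tax_rates)

# Step 4: fill missing values, filter, merge, and aggregate with Pandas.
orders = pd.DataFrame({
    "customer_id": [1, 2, 1, 3],
    "amount": [50.0, None, 30.0, 20.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "region": ["north", "south", "north"],
})

orders["amount"] = orders["amount"].fillna(0.0)            # handle missing values
large_orders = orders[orders["amount"] >= 30.0]            # filter rows
merged = orders.merge(customers, on="customer_id", how="left")  # merge DataFrames
region_totals = merged.groupby("region")["amount"].sum()   # group and aggregate

print(vector_totals)
print(with_tax)
print(region_totals)
```

On arrays this small the loop and the vectorized version are equally fast; the speed advantage of vectorization shows up as the arrays grow.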

Official & Core Documentation

  • Pandas Documentation — Data manipulation library with getting started tutorials and API reference (All levels)
  • NumPy Documentation — Numerical computing fundamentals, array operations, and linear algebra (Beginner)
  • scikit-learn User Guide — Machine learning algorithms with theory, examples, and parameter tuning (Intermediate)
  • Matplotlib Documentation — Plotting and visualization reference with gallery of examples (Beginner)
  • Seaborn Documentation — Statistical data visualization built on Matplotlib with elegant defaults (Beginner)
  • Jupyter Documentation — Interactive notebook environment for data exploration and presentation (Beginner)
  • Polars Documentation — Fast DataFrame library as a modern Pandas alternative for large datasets (Intermediate)
  • SciPy Documentation — Scientific computing library for optimization, statistics, and signal processing (Intermediate)
  • AI & Data Scientist Roadmap — Visual step-by-step guide to the data science learning path (Beginner)

GitHub Awesome Lists & Curated Collections

Interactive Courses & Hands-On Platforms

Free Courses

University & MOOC Courses

Practice & Challenges

  • Kaggle Competitions — Real ML challenges with datasets, leaderboards, and prize money (All levels)
  • DrivenData — Data science competitions for social good with real-world impact (Intermediate)
  • StrataScratch — SQL and Python interview questions sourced from real company interviews (Intermediate)

Video Courses & YouTube Channels

Structured Course Playlists

Individual Creators & Channels

  • StatQuest with Josh Starmer — Statistics and ML concepts explained clearly with animations (All levels)
  • Krish Naik — ML, deep learning, and data science with hands-on tutorials (Intermediate)
  • Ken Jee — Data science projects, Kaggle walkthroughs, and career advice (Beginner)
  • codebasics — Data analytics and data science through practical project-based tutorials (Beginner)
  • sentdex — Python programming for data science, ML, and financial analysis (Intermediate)
  • Alex The Analyst — Data analytics tutorials, portfolio projects, and career guidance (Beginner)

Books & Long-Form Reading

Free Online Books

Essential Paid Books

Community, Practice & News

Forums & Discussion

Newsletters & Blogs

  • Data Elixir — Weekly curated data science news, articles, tools, and resources
  • Data Science Weekly — Free weekly digest of data science, ML, and AI articles and jobs
  • Towards Data Science — Medium publication with thousands of data science tutorials and insights
  • KDnuggets — Data science news, tutorials, cheat sheets, and career advice since 1997

Ecosystem Resources

Tools & Environments

  • JupyterLab — Interactive notebook environment for data exploration, visualization, and documentation
  • Google Colab — Free cloud-based Jupyter notebooks with GPU access and pre-installed libraries
  • Anaconda — Python distribution with data science packages pre-installed and environment management
  • Streamlit — Turn Python data scripts into shareable web apps with minimal code
  • DuckDB — Fast in-process analytical database that queries Pandas DataFrames, Parquet, and CSV directly
  • Observable — Interactive data visualization and analysis notebooks for the web
  • Kaggle Notebooks — Free cloud notebooks with GPU/TPU access and direct dataset integration
  • Deepnote — Collaborative data science notebooks with real-time editing and SQL integration

Cheat Sheets & Quick References

Project Ideas & Datasets