Bioinformatic Random Seed

An overview of MOOC courses in ML

(The ones that I have been taking)

Difficulty: edX MITx > Coursera = Udacity.
Coursera’s Mathematics for Machine Learning: Linear Algebra and Machine Learning with Python are recommended as introductory or review courses.
Those who are not familiar with Python can use the Udacity projects to practice; manual code review is available.
The MITx Statistics and Data Science MicroMasters (SDS MM) suits people who are willing to invest serious time; it is relatively hardcore.

1. Coursera

1.1. Mathematics for Machine Learning: Linear Algebra (Imperial College London)

Includes applications of eigenvalues, eigenvectors and matrix transformations. Also practices the PageRank algorithm.
Beginner level, good for linear algebra review.
5 weeks in total; if you are not busy you can finish in 2 weeks.
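
To give a flavor of how PageRank connects to eigenvectors, here is a minimal sketch of my own (not the course's assignment code). The rank vector is the dominant eigenvector of the damped link matrix, and repeatedly redistributing rank along the links (power iteration) converges to it. The toy 3-page graph, the function name `pagerank` and the damping value 0.85 are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration on a link graph given as {page: [pages it links to]}.

    Hypothetical helper for illustration only. Assumes every page appears
    as a key and has at least one outgoing link (no dangling nodes).
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(max_iter):
        # every page first receives the "random jump" share
        new_rank = {p: (1.0 - damping) / n for p in pages}
        # then each page passes its damped rank evenly to its link targets
        for src, targets in links.items():
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:  # stop once the vector has stopped moving
            break
    return rank


# Toy graph (made up): A -> B and C, B -> C, C -> A
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy graph C collects links from both A and B, so it ends up with the highest rank, followed by A and then B.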

1.2. IBM AI Engineering Professional Certificate

I recommend this specialization. You will get to practice Keras, PyTorch and TF. Pay by month (<$100). The faster you work through it, the cheaper it is.

1.2.1 Machine Learning with Python (Achieved grade 100%)
- Very good intro to common machine learning methods, with clear definitions and practices that are easy to work with
- Includes regression, classification, clustering, recommendation engines etc. For each part, the algorithm and its evaluation metrics are introduced.
- No need for hardcore math. 5-6 weeks of study, 3-6 hours per week (you can go at a faster pace if you’d like)
1.2.2 Scalable Machine Learning on Big Data using Apache Spark (Achieved grade 90%)
- I didn’t particularly like the instructor’s teaching style.
- It is not really necessary for academic work.
- Approx. 7 hours to complete
1.2.3 Introduction to Deep Learning & Neural Networks with Keras (Achieved grade 100%)
- This course is good for Keras practice.
- Practice of NN (CNN, RNN), Autoencoder, Restricted Boltzmann Machines
- Approx. 8 hours to complete
1.2.4 Deep Neural Networks with PyTorch (Achieved grade 100%)
- Similar to the Keras course but a bit more time consuming. You build simple NNs. The project is peer reviewed.
- Approx. 31 hours to complete
1.2.5 Building Deep Learning Models with TensorFlow (Achieved grade 100%)
- Similar to the Keras course. Introduces basic concepts such as tensors. You will build some NNs with TF.
- Covers CNN, RNN, autoencoder, LSTM and RBM, using backprop and a gradient descent optimizer.
- Approx. 13 hours to complete
1.2.6 AI Capstone Project with Deep Learning (Achieved grade 100%)
- Uses Keras / PyTorch to build an image classifier and compares pretrained model performance
- Approx. 16 hours to complete

1.3. IBM AI Enterprise Workflow Specialization

This specialization is mainly aimed at industry, and the classes are set in a fictitious company with its own project. However, the description of each course is not particularly clear, and the content design is a bit confusing. You need to enroll and pay monthly; I can’t remember the exact amount (~$100?), so the faster you finish, the more cost-effective it is. There is a related certification exam (https://www.ibm.com/certify/exam?id=C1000-059), but I think it would be rather useless for me, so I didn’t take it.

1.3.1 Business priorities and Data Ingestion
- Preliminary data workflow steps, including collection & ingestion (focus on business intuition)
- Approx. 8 hours to complete
1.3.2 Data Analysis and Hypothesis Testing
- Concepts of EDA, data visualization, missing data, estimation and hypothesis testing
- Approx. 11 hours to complete
1.3.3 Feature Engineering and Bias
- Includes transformations, class imbalance, sampling techniques, dimensionality reduction and outlier detection, with case studies
- Approx. 12 hours to complete
1.3.4 Machine Learning, Visual Recognition and NLP
- Includes model evaluation & performance metrics. Tree based methods (decision tree, random forest)
- IBM Watson Natural Language Understanding service and Watson Visual Recognition
- Approx. 14 hours to complete
1.3.5 Enterprise Model Deployment
- Uses Apache Spark and IBM Watson Studio to deploy ML models
- Application of collaborative filtering, content-based filtering and hyperparameter tuning
- Approx. 9 hours to complete

2. Udacity: Artificial Intelligence nanodegree

The framework of this nanodegree is similar to Berkeley’s online book: http://aima.cs.berkeley.edu/
Contents: constraint satisfaction, search, automated planning, optimization, adversarial search, probabilistic models etc.
Practice projects: Sudoku solver, airline planning, tree & graph search implementation, part-of-speech tagging etc.
Code is reviewed by a TA, which is really helpful if you are a Python beginner.
~$1000 for 3 months. There are frequent promotions.

3. edX: MITx Statistics and Data Science MicroMasters (I am working on it now)

The biggest difference between this program and the other classes I have taken is that it is instructor paced, so it is similar to the school experience. The workload is relatively large, and there are assignments every week. Each course description expects about 12 hours of work per week, but the time actually spent depends a lot on your background. There are posts from senior teaching assistants saying that the department (MIT IDSS) takes the program seriously, so you can’t approach it casually (the other online classes I took were quite relaxed). The instructors are rigorous, and the mathematical proofs behind all the algorithms are explained in detail. There are 5 courses in total, at $300 each. You can apply for an exemption. I didn’t try it (lazy).

There are two tracks (General track / Social science track). The only difference is one elective.

3.1 Data Analysis: Statistical Modeling and Computation in Applications (Ongoing)

My notes on Github: https://github.com/Yolanda-HT/MIT_6.419x
I recommend taking this after Probability (I started it before Probability because that is only offered in fall, and I wanted to start in Feb.)
Homework deadlines every Monday and Wednesday. 5 peer-reviewed analysis projects. No midterm or final.
You need to have knowledge of linear algebra and probability. All algorithms are explained in detail with proofs.
Contents:
- Genomics and high-dimensional data
- Criminal networks and network analysis
- Prices, economics and time series
- Environmental Data and Spatial Statistics

3.2 Machine Learning with Python-From Linear Models to Deep Learning (Ongoing)

My notes on Github: https://github.com/Yolanda-HT/MIT_6.86x
Recommended for people with a background in Python & machine learning.
All algorithms are introduced in detail. You need to implement some from scratch.
Contents:
- Representation, over-fitting, regularization, generalization, VC dimension;
- Clustering, classification, recommender problems, probabilistic modeling, reinforcement learning;
- On-line algorithms, support vector machines, and neural networks/deep learning.

3.3 Probability - The Science of Uncertainty and Data (Pending)

3.4 Fundamentals of Statistics (Pending)

3.5 Capstone Exam in Statistics and Data Science (Pending, only available for people who pass the first 4 courses)

4 tests, 2 hours each, taken with webcams. 2 cheat sheets allowed.