Machine Learning Pipeline: AWS SageMaker Workflow Process

Data Collection & Integration

  • Prediction: Label / Target.
  • Good Data: Good Data will contain a signal about the phenomenon you’re trying to model.
  • Observation: A single data point, made up of the label and the features.
  • Dataset: A collection of observations stacked together.
  • Data points to features ratio: As a rule of thumb, you need at least 10 times as many data points as features. So if you’ve got five features, you should have at least 50 data points in your training data.
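
The 10:1 rule of thumb above can be checked with a few lines of Python. This is only a sketch: the function names `min_training_rows` and `has_enough_data` are made up for illustration and are not part of SageMaker or any library.

```python
def min_training_rows(num_features: int, ratio: int = 10) -> int:
    """Minimum number of data points suggested by the rule of thumb."""
    if num_features < 0:
        raise ValueError("num_features must be non-negative")
    return ratio * num_features


def has_enough_data(num_rows: int, num_features: int, ratio: int = 10) -> bool:
    """True if the dataset meets the data points to features ratio."""
    return num_rows >= min_training_rows(num_features, ratio)


print(min_training_rows(5))    # 50
print(has_enough_data(40, 5))  # False
```

The ratio is kept as a parameter because 10 is a heuristic, not a hard limit.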

Data Preparation

Data Visualization & Analysis

  • Histograms: Histograms are effective visualizations for spotting outliers in data.
  • Imputation: Imputation makes a best guess as to what the value should actually be. In a regression problem, you can deal with outliers or missing data by assigning a new value using imputation.
  • Scatter Plots: Visualize the relationship between the features and the labels. It’s important to understand if there’s a strong correlation between features and labels.
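
As a rough illustration of imputation and of the feature-to-label relationship a scatter plot would show, the sketch below fills missing values with the median and then computes a Pearson correlation. The `sqft` and `price` values are invented sample data, and median imputation is just one simple strategy picked for the example.

```python
import math
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical feature (sqft) with gaps, and its label (price).
sqft = [50.0, None, 70.0, 80.0, None, 100.0]
price = [150.0, 180.0, 210.0, 240.0, 270.0, 300.0]

sqft_filled = impute_median(sqft)
r = pearson(sqft_filled, price)
print(sqft_filled)
print(f"correlation between feature and label: {r:.2f}")
```

A value of `r` close to 1 or -1 suggests a strong linear relationship between the feature and the label. Note that the imputed points pull the correlation away from what the fully observed data would give.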

Feature Selection & Engineering

Model Training

  • Randomize Data: Randomize the data during your split to help your model avoid bias. This is especially true with structured data, if your data comes in a specific order.
  • Underfitting: Low variance and high bias. These models are overly simple and they can’t really see the underlying patterns in the data.
  • Overfitting: High variance and low bias. These models are overly complex, and while they can detect patterns in the training data, they’re not accurate outside of the training data.
  • Parameter:
    • Internal to the model; it’s something the model can learn or estimate purely from the data.
    • An example of a parameter could be the weight of an ANN or the coefficients in linear regression.
    • The model has to have parameters to make predictions, and most often, these aren’t set by humans.
  • Hyperparameters: Set by humans. Typically you can’t know the best value of a hyperparameter in advance, but you can get there by trial and error.
    • It could be the learning rate for training a neural network.
  • Hyperparameter Tuning: One technique that can be used to combat underfitting and overfitting.
  • Categories of hyperparameters to tune:
    • Loss function
    • Regularization
    • Learning Parameters
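
The ideas in this list can be tied together in a small, library-free sketch: shuffle before splitting, let gradient descent learn the parameters (`w`, `b`), and pick the hyperparameter (the learning rate) by trial and error on held-out data. The synthetic data, the function names, and the candidate learning rates are all assumptions chosen for the example. This is not how SageMaker's own tuning jobs work.

```python
import random


def shuffled_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows, then split them into training and held-out sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]


def mse(w, b, rows):
    """Mean squared error of the line y = w * x + b on the given rows."""
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)


def fit(rows, learning_rate, epochs=200):
    """Learn the parameters w and b by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in rows) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in rows) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Synthetic, ordered data following y = 3x + 1 (an assumption for the demo).
data = [(i / 20, 3 * (i / 20) + 1) for i in range(20)]
train, held_out = shuffled_split(data)

# Trial and error over one hyperparameter: the learning rate.
best = None
for lr in (0.001, 0.01, 0.1, 0.5):
    w, b = fit(train, lr)
    loss = mse(w, b, held_out)
    if best is None or loss < best[0]:
        best = (loss, lr, w, b)

print(f"best learning rate: {best[1]}, w={best[2]:.3f}, b={best[3]:.3f}")
```

Without the shuffle, the held-out set would contain only the largest `x` values, because the data arrives in order. That is the kind of bias randomizing the split is meant to avoid.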

Model Evaluation



Machine Learning Algorithms

Introduction to Machine Learning - Artificial Intelligence / Machine Learning / Deep Learning / Neural Networks / Algorithms

Artificial Intelligence

AI is intelligent behavior by machines. That means any device that can perceive its environment and take actions accordingly has AI.

AI relies on something called knowledge engineering.

RM510Q-GL 5G LTE OpenWrt Modem USB PCI-E how-to: Rakuten Mobile Band 3 Lock

This article explains how to use the RM510Q-GL 5G LTE modem with OpenWrt.

Angular Routing and Single Page Applications Notes

Angular Services Notes

Google Angular Notes

Intro Docker in OpenWrt / Dockerman / Images


Package              Description
luci-app-dockerman   LuCI Support for docker

Docker Manager for LuCI

Intro OpenWrt Dynamic DNS (DDNS)



Domain Provider Info

Set up Dynamic DNS

Notes for CI/CD

DevOps methodology, plus some tools that might make deployments easier, such as infrastructure as code (IaC) and AWS CodeDeploy.
