Hands-on Machine Learning with JavaScript

Solve complex computational web problems using machine learning

Burak Kanber
About This Book

A definitive guide to creating an intelligent web application with the best of machine learning and JavaScript.

• Solve complex computational problems in the browser with JavaScript
• Teach your browser how to learn from rules using the power of machine learning
• Understand how machine learning is being used in web interfaces and APIs

Who This Book Is For

This book is for you if you are a JavaScript developer who wants to implement machine learning to make applications smarter, gain insightful information from data, and enter the field of machine learning without switching to another language. Working knowledge of the JavaScript language is expected to get the most out of the book.

What You Will Learn

• Get an overview of state-of-the-art machine learning
• Understand data pre-processing: handling, cleaning, and preparation
• Learn about mining and pattern extraction with JavaScript
• Build your own models for classification, clustering, and prediction
• Identify the most appropriate model for each type of problem
• Apply machine learning techniques to real-world applications
• Learn how JavaScript can be a powerful language for machine learning

In Detail

In over 20 years of existence, JavaScript has been pushing beyond the boundaries of web evolution, with a proven presence on servers, embedded devices, smart TVs, IoT, smart cars, and more. Today, with the added advantage of machine learning research and support from JS libraries, JavaScript makes your browsers smarter than ever, with the ability to learn patterns and reproduce them to become part of innovative products and applications.

Hands-on Machine Learning with JavaScript presents various avenues of machine learning in a practical and objective way, and helps you implement them using the JavaScript language. Predicting behaviors, analyzing sentiment, grouping data, and building neural models are some of the skills you will build with this book. You will learn how to train your machine learning models and work with different kinds of data. Along the way, you will come across use cases such as face detection, spam filtering, recommendation systems, character recognition, and more. Moreover, you will learn how to work with deep neural networks and guide your applications to gain insights from data.

By the end of this book, you'll have gained hands-on knowledge of evaluating and implementing the right model, along with choosing from different JS libraries, such as NaturalNode, brain, harthur, classifier, and many more, to design smarter applications.

Style and approach

This is a practical tutorial that uses hands-on examples to step through some real-world applications of machine learning. Without shying away from the technical details, you will explore machine learning with JavaScript using clear and practical examples.

Classification Algorithms

Classification problems involve detecting patterns in data and using those patterns to assign a data point to a group of similar data points. If that's too abstract, here are some examples of classification problems: analyzing an email to determine whether it's spam; detecting the language of a piece of text; reading an article and categorizing it as finance, sports, politics, opinion pieces, or crime; and determining whether a review of your product posted on Twitter is positive or negative (this last example is commonly called sentiment analysis).
Classification algorithms are tools that solve classification problems. By definition, they are supervised learning algorithms, as they'll always need a labeled training set to build a model from. There are lots of classification algorithms, each designed with a specific principle in mind or for a particular type of input data.
In this chapter, we'll discuss four classifiers: k-Nearest Neighbors (KNN), Naive Bayes, Support Vector Machines (SVMs), and random forests. Here's a brief introduction to each of the algorithms:
  • The KNN algorithm is one of the simplest classifiers, and works well when your dataset has numerical features and clustered patterns. It is similar in nature to the k-means clustering algorithm, in that it relies on plotting data points and measuring distances from point to point.
  • The Naive Bayes classifier is an effective and versatile classifier based on Bayesian probability. While it can be used for numerical data, it's most commonly used in text classification problems, such as spam detection and sentiment analysis. Naive Bayes classifiers, when implemented properly, can be both fast and highly accurate for narrow domains. The Naive Bayes classifier is one of my go-to algorithms for classification.
  • SVMs are, in spirit, a very advanced form of the KNN algorithm. The SVM graphs your data and attempts to find dividing lines between the categories you've labeled. Using some non-trivial mathematics, the SVM can linearize non-linear patterns, so this tool can be effective for both linear and non-linear data.
  • Random forests are a relatively recent development in classification algorithms, but they are effective and versatile and therefore a go-to classifier for many researchers, myself included. Random forests build an ensemble of decision trees (another type of classifier we'll discuss later), each with a random subset of the data's features. Decision trees can handle both numerical and categorical data, they can perform both regression and classification tasks, and they also assist in feature selection, so they are becoming many researchers' first tool to grab when facing new problems.

k-Nearest Neighbor

KNN is a simple, fast, and straightforward classification algorithm. It is very useful for categorized numerical datasets where the data is naturally clustered. It will feel similar in some ways to the k-means clustering algorithm, with the major distinction being that k-means is an unsupervised algorithm while KNN is a supervised learning algorithm.
If you were to perform a KNN analysis manually, here's how it would go: first, plot all your training data on a graph, and label each point with its category. When you wish to classify a new, unknown point, put it on the graph and find the k closest points to it (the nearest neighbors). The number k should be odd in order to avoid ties (at least in two-class problems); three is a good starting point, but some applications will need more and some can get away with one. Report whichever label the majority of the k nearest neighbors share; that is the result of the algorithm.
Finding the k nearest neighbors to a test point is straightforward, but it can benefit from some optimizations if your training data is very large. Typically, when evaluating a new point, you calculate the Euclidean distance (the familiar, high school geometry distance measure we introduced in Chapter 4, Grouping with Clustering Algorithms) between your test point and every training point, and sort the training points by that distance. This brute-force approach is usually fast enough, since training sets generally contain no more than 10,000 points or so.
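For reference, a Euclidean distance measure like the one described might look like the following in plain JavaScript. This is a minimal sketch, not the book's implementation:

// Euclidean distance between two points of equal dimension,
// where each point is an array of numbers such as [x, y].
const distance = (a, b) =>
    Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0));

console.log(distance([0, 0], [3, 4])); // 5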
If you have many training examples (on the order of millions), or you really need the algorithm to be lightning fast, there are two optimizations you can make. The first is to skip the square root operation in the distance measure and use the squared distance instead. While modern CPUs are very fast, the square root operation is still much slower than multiplication and addition, so you can save a few milliseconds by avoiding it; because squaring preserves the ordering of distances, the ranking of neighbors is unchanged. The second optimization is to consider only points within some bounding box around your test point; for instance, only consider points within +/- 5 units of the test point's location in each dimension. If your training data is dense, this optimization will not affect the results, but it will speed up the algorithm because it avoids calculating distances for many points.
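To make those two optimizations concrete, here is a hedged sketch; the squaredDistance and inBounds helpers, along with the sample data, are illustrative names and values, not taken from the book:

// Squared Euclidean distance: skips Math.sqrt but preserves the
// ordering of distances, so it's safe for ranking neighbors.
const squaredDistance = (a, b) =>
    a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0);

// Cheap pre-filter: keep only points whose every coordinate lies
// within +/- range units of the test point.
const inBounds = (point, test, range = 5) =>
    point.every((coord, i) => Math.abs(coord - test[i]) <= range);

const trainingPoints = [[1, 2], [2, 1], [80, 90]];
const testPoint = [2, 2];
const candidates = trainingPoints.filter(p => inBounds(p, testPoint));
// candidates is [[1, 2], [2, 1]]; the distance to [80, 90] is never computed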
The following is the KNN algorithm as a high-level description:
  1. Record all training data and their labels
  2. Given a new point to evaluate, generate a list of its distances to all training points
  3. Sort the list of distances in order of closest to farthest
  4. Throw out all but the k nearest distances
  5. Determine which label represents the majority of your k nearest neighbors; this is the result of the algorithm
A more efficient version avoids maintaining and sorting a large list of distances by capping the list at k items, as the sketch below illustrates.
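Here is a minimal sketch of that bounded-list idea in plain JavaScript. The names and the data format (an array of {point, label} objects) are assumptions for illustration, not the book's implementation:

// Squared distance is enough here, since we only compare distances.
const squaredDistance = (a, b) =>
    a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0);

const knn = (trainingData, testPoint, k = 3) => {
    const best = []; // the k nearest neighbors so far, sorted by distance
    for (const {point, label} of trainingData) {
        const dist = squaredDistance(point, testPoint);
        // Insert only if we have fewer than k neighbors, or this one is
        // closer than the current worst; the list never grows beyond k.
        if (best.length < k || dist < best[best.length - 1].dist) {
            best.push({dist, label});
            best.sort((a, b) => a.dist - b.dist);
            if (best.length > k) best.pop();
        }
    }
    // Majority vote among the k nearest labels
    const votes = {};
    for (const {label} of best) votes[label] = (votes[label] || 0) + 1;
    return Object.keys(votes).reduce((a, b) => (votes[a] >= votes[b] ? a : b));
};

console.log(knn([
    {point: [1, 1], label: 'red'},
    {point: [1, 2], label: 'red'},
    {point: [9, 9], label: 'blue'},
], [0, 0])); // 'red'

With that picture in place, let's now write our own implementation of the KNN algorithm.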

Building the KNN algorithm

Since the KNN algorithm is quite simple, we'll build our own implementation:
  1. Create a new folder and name it Ch5-knn.
  2. In that folder, add the following package.json file. Note that this file is a little different from those in previous examples, because we have added a dependency on jimp, an image-processing library that we'll use in the second example:
{
  "name": "Ch5-knn",
  "version": "1.0.0",
  "description": "ML in JS Example for Chapter 5 - k-nearest-neighbor",
  "main": "src/index.js",
  "author": "Burak Kanber",
  "license": "MIT",
  "scripts": {
    "build-web": "browserify src/index.js -o dist/index.js -t [ babelify --presets [ env ] ]",
    "build-cli": "browserify src/index.js --node -o dist/ind...
