

Real-Time Machine Learning with TensorFlow in Data Collector


The true value of a modern DataOps platform is realized only when business users and applications can access raw and aggregated data from a range of sources and produce data-driven insights in a timely manner. With machine learning (ML), analysts and data scientists can use historical data, together with technologies such as TensorFlow (TF), to make better data-driven business decisions, both offline and in real time.
In this article, you will learn how to use TensorFlow models for prediction and classification with the TensorFlow Evaluator*, newly released in StreamSets Data Collector 3.5.0 and StreamSets Data Collector Edge.
Before going into the details, let's look at some basic concepts.
Machine learning
Arthur Samuel described it as "the field of study that gives computers the ability to learn without being explicitly programmed." With recent advances in machine learning, computers can now make predictions as well as humans, or on some tasks even better, and it can feel as if machine learning is able to solve any problem. Let's review the kinds of problems it actually solves.
Generally speaking, machine learning is divided into two categories:
Supervised learning
"Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs." - Wikipedia.
It involves building an accurate model that can predict an outcome, using historical data that has already been labeled with those outcomes.
Common business problems solved by supervised learning:
Binary classification (learning to predict a categorical value)
-Will a customer buy a specific product?
-Is a tumor malignant or benign?
Multiclass classification (learning to predict a categorical value)
-Does a given piece of text contain toxic, threatening, or obscene content?
-Is this flower an Iris setosa, Iris versicolor, or Iris virginica?
Regression (learning to predict a continuous value)
-What is the predicted sale price of a house?
-What will the temperature be in San Francisco tomorrow?
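To make the binary classification case concrete, here is a minimal sketch of supervised learning in plain Python: a logistic regression fitted by gradient descent. The feature values and labels are invented for illustration only, and the function names are hypothetical; this is not the model used later in the article.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, learning_rate=0.1, epochs=5000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(features[0])
    n_samples = len(features)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(score) - y  # predicted probability minus label
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Made-up labeled history: [size score, texture score] -> 1 (malignant) or 0 (benign)
features = [[1.0, 1.2], [1.5, 0.8], [2.0, 1.0], [4.0, 3.5], [4.5, 3.0], [5.0, 4.0]]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(features, labels)
print(predict(weights, bias, [1.5, 1.0]), predict(weights, bias, [4.5, 3.5]))  # prints: 0 1
```

The labeled examples play the role of the historical data: the model only learns the mapping because each past record carries its known outcome.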
Unsupervised learning
Unsupervised learning lets us approach problems with little or no idea of what the output should look like. It involves building a model when no labels are available for the historical data. In this kind of problem, structure is derived by clustering the data based on the relationships among its variables.
Two common methods of unsupervised learning are K-means clustering and DBSCAN.
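As a rough illustration of clustering without labels, the sketch below implements the basic K-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points) in plain Python. The points are invented, and the starting centroids are fixed here for reproducibility; practical implementations usually choose them randomly.

```python
def kmeans(points, centroids, iterations=10):
    """Basic K-means: alternate nearest-centroid assignment and centroid update."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Made-up, unlabeled two-dimensional data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

No outcome column is supplied anywhere: the two groups emerge only from the distances between the points.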
Note: the TensorFlow Evaluator in Data Collector and Data Collector Edge currently supports only supervised learning models.
Neural network and deep learning


Neural networks are a class of machine learning algorithms that learn and use a model of computation inspired by the structure of the human brain. Compared with other machine learning algorithms, such as decision trees and logistic regression, neural networks have been shown to achieve high accuracy on many tasks.
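The following sketch shows what such a model of computation looks like at its smallest: layers of weighted sums followed by a nonlinear activation. The weights are chosen by hand (not learned) so that a single hidden layer computes XOR, a function that a purely linear model such as logistic regression cannot represent. All names are illustrative.

```python
def relu(z):
    return max(0.0, z)


def identity(z):
    return z


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(weighted sum + bias) per neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def xor_network(x1, x2):
    # Hidden layer with two ReLU neurons: relu(x1 + x2) and relu(x1 + x2 - 1).
    hidden = dense([x1, x2], weights=[[1.0, 1.0], [1.0, 1.0]],
                   biases=[0.0, -1.0], activation=relu)
    # Output neuron: first hidden value minus twice the second.
    return dense(hidden, weights=[[1.0, -2.0]], biases=[0.0],
                 activation=identity)[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

In a trained network the weights and biases are found by an optimizer rather than written by hand, and deep networks stack many such layers.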
Andrew Ng describes deep learning in the context of traditional artificial neural networks. In his talk titled "Deep Learning, Self-Taught Learning and Unsupervised Feature Learning", he described the idea of deep learning as:
"Using brain simulations, hope to:
-Make learning algorithms much better and easier to use;
-Make revolutionary advances in machine learning and AI.
I believe this is our best shot at progress towards real AI."
Common neural networks and deep learning applications include:
Computer vision / image recognition / object detection
Speech recognition / natural language processing (NLP)
Recommendation systems (products, matchmaking, etc.)
Anomaly detection (network security, etc.)
TensorFlow


TensorFlow is an open source machine learning framework designed for deep neural networks and developed by the Google Brain team. It supports scalable and portable training on operating systems including Windows and macOS, across CPUs, GPUs, and TPUs. It is one of the most popular and active machine learning projects on GitHub.
TensorFlow in Data Collector
With the introduction of the TensorFlow Evaluator, you can now create pipelines that ingest data or features and generate predictions or classifications within a contained environment, without having to make HTTP or REST API calls to machine learning models served and exposed as web services. For example, a Data Collector pipeline can now detect fraudulent transactions in real time, or perform natural language processing on text, as the data passes through the pipeline's stages before being stored in the destination for further processing or decision-making.
In addition, with Data Collector Edge, you can run TensorFlow-enabled machine learning pipelines on devices such as the Raspberry Pi and others running on supported platforms. For example, a pipeline could estimate the probability of natural disasters such as floods in high-risk areas to help prevent damage to people's property.
Classification of breast cancer


Let's consider the use case of classifying a breast cancer tumor as malignant or benign. The breast cancer dataset is a classic dataset that is available as part of scikit-learn. To see how to train and export a simple TensorFlow model in Python using this dataset, see my code on GitHub. As you'll see, model creation and training are kept minimal and very simple, with only a couple of hidden layers. The most important aspect to note is how the model is exported and saved using TensorFlow SavedModelBuilder*.
*Note: to use TensorFlow models in Data Collector or Data Collector Edge, you should first export and save them with TensorFlow's SavedModelBuilder, in a supported language of your choice such as Python, and in an interactive environment such as Jupyter Notebook.
Once the model is trained and exported using TensorFlow's SavedModelBuilder, it is straightforward to use it for prediction or classification in a dataflow pipeline, as long as the model is saved in a location accessible to Data Collector or Data Collector Edge.
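For orientation, the export step might look roughly like the sketch below. It is not the code from the GitHub repository mentioned above: it assumes TensorFlow 2.x with the `tf.compat.v1` API, skips training entirely, uses a single sigmoid layer instead of hidden layers, and the tensor names, signature key, and export directory are all hypothetical. The only detail taken from the dataset is the input width of 30 features.

```python
import os
import tempfile

import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style graph and session API

# Hypothetical export location; SavedModelBuilder requires that it does not exist yet.
export_dir = os.path.join(tempfile.mkdtemp(), "breast_cancer_model")

graph = tf1.Graph()
with graph.as_default():
    # 30 numeric features per record, as in scikit-learn's breast cancer dataset.
    inputs = tf1.placeholder(tf.float32, shape=[None, 30], name="inputs")
    weights = tf1.get_variable("weights", shape=[30, 1])
    bias = tf1.get_variable("bias", shape=[1], initializer=tf1.zeros_initializer())
    probability = tf.sigmoid(tf.matmul(inputs, weights) + bias, name="probability")

    with tf1.Session(graph=graph) as session:
        # Only initializes the variables; a real model would be trained here.
        session.run(tf1.global_variables_initializer())

        # Describe the model's input and output tensors for consumers.
        signature = tf1.saved_model.signature_def_utils.predict_signature_def(
            inputs={"inputs": inputs}, outputs={"probability": probability})

        builder = tf1.saved_model.builder.SavedModelBuilder(export_dir)
        builder.add_meta_graph_and_variables(
            session,
            tags=[tf1.saved_model.tag_constants.SERVING],
            signature_def_map={"serving_default": signature})
        builder.save()

print(export_dir, sorted(os.listdir(export_dir)))
```

The resulting directory, containing `saved_model.pb` and the saved variables, is what would need to be placed in a location that Data Collector or Data Collector Edge can read. The tag and the input and output names chosen at export time are the details a consumer of the model has to match.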


