Data Scientist - Reverse Engineering, Machine Learning & AI ...

Teradata Employee

Reverse engineering, also called back engineering, is the process of extracting knowledge or design information from anything man-made and reproducing it, or reproducing anything based on the extracted information.[1]:3 The process often involves disassembling something (a mechanical device, electronic component, computer program, or biological, chemical, or organic matter) and analyzing its components and workings in detail.

The usual goal of reverse engineering is eventually to 'forward engineer', that is, to modify and enhance the system with full knowledge of its workings.

Machine learning is the subfield of computer science that gives computers the ability to learn without being explicitly programmed (Arthur Samuel, 1959).[1] Evolved from the study of pattern recognition and computational learning theory in artificial intelligence,[2] machine learning explores the study and construction of algorithms that can learn from and make predictions on data[3] – such algorithms overcome following strictly static program instructions by making data driven predictions or decisions,[4]:2 through building a model from sample inputs. Machine learning is employed in a range of computing tasks where designing and programming explicit algorithms is infeasible; example applications include spam filtering, detection of network intruders or malicious insiders working towards a data breach,[5] optical character recognition (OCR),[6] search engines and computer vision.

The usual goal of machine learning is to learn a model of how a system works, so that algorithms can predict the output for previously unseen inputs.

What's the difference in approaches?

Reverse engineering in the business world means trying to understand the levers in a system: how its outputs (regulatory compliance, marketing attribution, loan approval) map to its inputs, so that the behavior can be explained. That is easier said than done. As time goes on, systems can become so complex that the output of reverse engineering is unusable, no matter how well it is documented. Businesses hire consulting companies to audit legacy systems and unravel what they are made of in an effort to improve them. It is supposed to be a deterministic approach. What may not be possible, however, is tracking down dependencies each time a new feature is added to the system. Dollars are wasted exploring this way.

A machine learning/AI approach to the same problem can treat the system as a black box and arrive at a model that maps inputs to outputs by studying historical data. At some point the model can mimic the original system and translate a given set of inputs into the desired outputs. Machine learning/AI approaches are really good approximations to a reverse + forward engineering solution and are less troublesome. With the right algorithms, the rules embedded in the system can be explained through visualizations.
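As a rough sketch of the black-box idea, assume a hypothetical loan-approval function `legacy_system` whose internals we pretend not to know. The code below logs its decisions on made-up historical inputs, fits a small decision tree written in plain Python, and prints the rules the tree recovered. Every name, threshold, and data point here is an assumption made for illustration, not something taken from a real system.

```python
import random

def legacy_system(income, debt_ratio):
    """Hypothetical legacy rule; in practice only its logged decisions are visible."""
    return income >= 50_000 and debt_ratio < 0.4

def gini(labels):
    """Gini impurity of a list of booleans."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels):
    """Return the (feature_index, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between two observed values
            left = [y for r, y in zip(rows, labels) if r[f] < t]
            right = [y for r, y in zip(rows, labels) if r[f] >= t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Greedy tree: internal nodes are (feature, threshold, left, right), leaves are bools."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return sum(labels) * 2 >= len(labels)  # leaf: majority decision
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] < t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] >= t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk the tree until a leaf (a bool) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] < t else right
    return tree

def describe(tree, names, indent=""):
    """Render the learned tree as readable if/else rules."""
    if not isinstance(tree, tuple):
        return [f"{indent}-> {'approve' if tree else 'reject'}"]
    f, t, left, right = tree
    return ([f"{indent}if {names[f]} < {t:.2f}:"] + describe(left, names, indent + "  ")
            + [f"{indent}else:"] + describe(right, names, indent + "  "))

# Made-up "historical data": inputs plus the decisions the black box produced.
random.seed(0)
history = [(random.uniform(10_000, 150_000), random.random()) for _ in range(500)]
decisions = [legacy_system(*row) for row in history]

tree = build_tree(history, decisions)
print("\n".join(describe(tree, ["income", "debt_ratio"])))

# How often does the mimic agree with the original on inputs it has not seen?
fresh = [(random.uniform(10_000, 150_000), random.random()) for _ in range(1000)]
fidelity = sum(predict(tree, r) == legacy_system(*r) for r in fresh) / len(fresh)
print(f"agreement with legacy system: {fidelity:.1%}")
```

The hand-written tree is only a stand-in to keep the sketch dependency-free; in practice a library implementation such as scikit-learn's `DecisionTreeClassifier` would likely take its place. The printed if/else rules are the kind of output that can then be turned into a visualization.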

Other Tradeoffs

Using machine learning/AI to learn to "mimic" an existing system is a lot faster and easier, given the right conditions. The biggest tradeoff is the 'explainability' of what the models discover. Whether it's bluffing in a poker game or translating one language to another, machine learning/AI approaches have proven to be awesome. The downside is that humans have to interpret and understand the models that were discovered by machine learning techniques. That could be a problem when explaining results to stakeholders, but the models are still useful.
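To make the explainability tradeoff concrete, here is a small sketch under the same kind of assumptions: a hypothetical `legacy_system` stands in for the system being mimicked, and a plain-Python nearest-neighbour vote stands in for an opaque model. The mimic can agree with the original on most inputs, yet it holds no rule it could show a stakeholder; the closest thing to an explanation is a list of similar past cases. All names and numbers are invented for illustration.

```python
import math
import random

def legacy_system(income, debt_ratio):
    """Hypothetical legacy rule, used here only to generate logged decisions."""
    return income >= 50_000 and debt_ratio < 0.4

def scale(row):
    """Put both features on a roughly 0-1 range so distances are comparable."""
    income, debt_ratio = row
    return (income / 150_000, debt_ratio)

def knn_predict(history, decisions, row, k=5):
    """Majority vote among the k logged cases closest to `row`.

    Returns the predicted decision and the neighbouring cases that produced it.
    """
    target = scale(row)
    nearest = sorted(range(len(history)),
                     key=lambda i: math.dist(scale(history[i]), target))[:k]
    votes = sum(decisions[i] for i in nearest)
    return votes * 2 > k, [history[i] for i in nearest]

# Made-up historical inputs and the decisions the black box produced for them.
random.seed(1)
history = [(random.uniform(10_000, 150_000), random.random()) for _ in range(500)]
decisions = [legacy_system(*row) for row in history]

# Agreement between the mimic and the original on inputs it has not seen.
fresh = [(random.uniform(10_000, 150_000), random.random()) for _ in range(300)]
agreement = sum(knn_predict(history, decisions, r)[0] == legacy_system(*r)
                for r in fresh) / len(fresh)
print(f"agreement with legacy system: {agreement:.1%}")

# The only "explanation" this model can offer is a handful of similar past cases.
approved, neighbours = knn_predict(history, decisions, (80_000, 0.1))
print("approved:", approved)
for income, debt_ratio in neighbours:
    print(f"  similar case: income={income:,.0f}, debt_ratio={debt_ratio:.2f}")
```

Nothing in the output states "income must be at least 50,000", even though the predictions behave as if that rule existed. That gap between accurate behavior and a statable rule is the difficulty when explaining such a model to stakeholders.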