Learn Data Science

Explore and Discover

The latest articles, videos and blog posts speak to those interested in analytics-based decision making. Browse all content in the Teradata Data Science Community to gain valuable insights.

[Image: Connected Networks, Yasmeen Ahmad]

 

About the Insights

This anonymized visualization was created for a Telco operator analyzing residential Telco lines. The project aimed to identify linkages between line and network hardware performance that may impact customer experience.

 

The dots (nodes) represent DSLAMs (Digital Subscriber Line Access Multiplexers) on the Telco's network. DSLAMs provide a vital service that can impact customer call experience: they connect customer lines to the main network. DSLAM service levels were measured by metrics such as attenuation, bit rate, noise margin and output power, and clustered into three performance categories for each line. Purple nodes show DSLAMs with excellent performance, orange nodes good performance and white nodes poor performance.

 

In the chart, only a small number of DSLAMs delivered a high quality of service (purple). These DSLAMs were co-located in the same building as the main network infrastructure, so their proximity to the central network hub results in a premium service. The majority of customers receive a good experience (orange); however, a large number of DSLAMs deliver a poor service (white), and these were found to be located outside the main city.

 

Customer experience and satisfaction suffer most when customers receive variable network quality. The Telco's primary concern is to ensure customers receive a consistent experience, even if it is consistently poor because they are located outside the main city. The chart pinpoints every DSLAM that delivers variable service levels, represented by the nodes shared between the good (orange) and poor (white) clusters. Armed with this data, the Telco can now investigate and optimize the variable DSLAMs.
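As a rough illustration of how the "shared" nodes could be found, the sketch below flags any DSLAM whose lines fall into more than one performance cluster. The input layout, the DSLAM ids and the function name `find_variable_dslams` are assumptions made for this example; they are not taken from the original analysis.

```python
from collections import defaultdict

# Hypothetical input: one (dslam_id, performance_cluster) pair per residential line.
line_clusters = [
    ("dslam-01", "good"),
    ("dslam-01", "good"),
    ("dslam-02", "good"),
    ("dslam-02", "poor"),
    ("dslam-03", "poor"),
    ("dslam-04", "excellent"),
]

def find_variable_dslams(pairs):
    """Return the ids of DSLAMs whose lines appear in more than one cluster."""
    clusters_by_dslam = defaultdict(set)
    for dslam_id, cluster in pairs:
        clusters_by_dslam[dslam_id].add(cluster)
    return sorted(d for d, clusters in clusters_by_dslam.items() if len(clusters) > 1)

variable_dslams = find_variable_dslams(line_clusters)
print(variable_dslams)
```

Here only the DSLAM that serves both a "good" and a "poor" line is reported, which mirrors the nodes sitting between the orange and white clusters in the chart.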

 

About the Analytics

This sigma visualization was created using the in-built analytics and visualizations found in the Teradata Aster platform.

 

Data attributes such as attenuation and bit rate were gathered from residential lines across the city. These attributes were clustered to identify performance bands indicating customer network experience.
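The post does not say which clustering algorithm was used on the Aster platform. As one possible reading, the minimal sketch below groups lines into three bands with plain k-means in pure Python. The attribute values are invented, the choice of k-means is an assumption, and a real analysis would normally scale the attributes first.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means. The first k points are used as the initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical (attenuation in dB, noise margin in dB) readings, one per line.
lines = [
    (10.0, 30.0),
    (35.0, 12.0),
    (60.0, 4.0),
    (12.0, 28.0),
    (33.0, 14.0),
    (58.0, 5.0),
]

centroids, labels = kmeans(lines, k=3)
print(labels)
print(centroids)
```

Each label can then be read as a performance band, for example by ranking the centroids on attenuation.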

 

These clusters formed a basis for correlation and regression analyses to determine how network performance varied with factors such as line technology and length, modem type and configuration, DSLAM, card technology and geographic location.
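To make the correlation and regression step concrete, here is a small sketch that relates one factor (line length) to one metric (attenuation) using Pearson correlation and an ordinary least-squares fit. The numbers are made up for illustration and the helper names are assumptions; the original project likely considered many factors at once.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def least_squares(xs, ys):
    """Return (slope, intercept) of the fitted line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x

# Hypothetical measurements: line length (km) against attenuation (dB).
line_length_km = [0.5, 1.0, 2.0, 3.0, 4.0]
attenuation_db = [7.0, 14.0, 27.0, 41.0, 56.0]

r = pearson(line_length_km, attenuation_db)
slope, intercept = least_squares(line_length_km, attenuation_db)
print(f"correlation={r:.3f}, slope={slope:.2f} dB per km")
```

A strong positive correlation in data like this would suggest that longer lines tend to show higher attenuation, which is the kind of relationship the analysis looked for across factors.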

 

The sigma visualization shows only one part of the overall analysis, namely the linkage between DSLAMs and network performance.

 

About the Analyst

Yasmeen is one of the most creative and insightful Data Scientists at Teradata. Yasmeen grew up in Scotland, where she enjoys the great outdoors, in particular hiking the Scottish Munros and sea kayaking.

 

Her work has seen her traverse many countries, including the UK, Ireland, the Netherlands, Turkey, Belgium and Denmark, where she covers the finance, telecommunications, retail and utilities industries. Yasmeen specializes in working with businesses to identify their challenges and translate them into an analytical context. She has a unique ability to focus on how businesses can leverage new or untapped sources of data, alongside novel techniques, to enhance their competitive capabilities.

 

Yasmeen has worked with many analytical teams, providing leadership, training, guidance and hands-on support to deliver actionable insights and business outcomes. She uses various analytical approaches, including text analytics, predictive modelling, development of attribution strategies and time series analysis. She believes strongly in the power of visualizations and their ability to communicate complex findings to business users in a way that makes taking action easy.

 

Prior to Teradata, Yasmeen worked as a Data Scientist in the life sciences industry, building analytical pipelines for complex, multi-dimensional data types. Yasmeen also holds a PhD in Data Management, Mining and Visualization, which was carried out at the Wellcome Trust Centre for Gene Regulation & Expression. She has published several papers internationally and is a speaker at international conferences and events. In addition, she has taught on MSc courses related to Data Science and Business Intelligence.

 

Yasmeen developed a keen passion for data analytics and visualization through her studies, having always been curious to ask questions and learn more. These skills have allowed Yasmeen to explore many opportunities in multiple disciplines, providing her with an endless world of new challenges!

It's not enough anymore to simply buy and install packaged software, or to build a custom system to solve discrete problems. Instead, companies that thrive will transform their organizations - both digitally and culturally - to make intelligence pervasive throughout the enterprise.

 

Learn more about Pervasive Intelligence - what it is and why we need it... 

Fusing business acumen, data science, and creative visualization, the Burning Leaf of Spending enabled a major bank to detect anomalies in customer spending patterns that indicate major life events, and provided artful insights into the personalized service required to enhance the customer experience, improving lifetime value.

 

 

[Image: The Leaf, Alexander Heidl]

About the Insights
'The Leaf' fuses real-life imagery with a data visualization to provide a vivid demonstration of where the future of analytics may be going. As technology improves both the graphics and the speed and ease with which data can be visualized, one emerging form uses real-life imagery to replace the technical diagrams of the past.

 

The implications are huge. Free of imposing technical diagrams, visualizations using real-life imagery allow insights to be easily consumed by anyone, even small children. Marketers can translate product benefits using real-life representations. For example, showing farmers the physical benefits of fertilizers and chemical protectants through real-life images of their farms, with the different crop growth they can achieve, may convey a sales message with an insight not many farmers would get from graphs alone.

 

The Leaf image was created using Kailash Purang's 'Single Malt Sampler' data set. In this graph the dots (nodes) that form the spine of the leaf are the whisky brands, with similar-tasting whiskies appearing closest together. The lines (edges) link each brand to other brands that share a flavour characteristic. The result was this near-perfect leaf image.
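The edge rule described above (link two brands when they share a flavour characteristic) can be sketched in a few lines. The brand names and flavour sets below are placeholders, not values from the 'Single Malt Sampler' data set, whose actual schema is not described in this post.

```python
from itertools import combinations

# Hypothetical flavour profiles; real brands and characteristics would
# come from the data set itself.
flavours = {
    "Brand A": {"smoky", "peaty"},
    "Brand B": {"smoky", "sweet"},
    "Brand C": {"fruity", "sweet"},
    "Brand D": {"floral"},
}

def shared_flavour_edges(profiles):
    """Return (brand, brand, shared flavours) for every pair with an overlap."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        shared = profiles[a] & profiles[b]
        if shared:
            edges.append((a, b, sorted(shared)))
    return edges

edges = shared_flavour_edges(flavours)
for a, b, shared in edges:
    print(a, "--", b, shared)
```

The resulting edge list is the kind of input a graph tool can lay out; a brand with no shared characteristic simply stays unconnected.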

 

Thus 'The Leaf' adapts to what Kevin Slavin refers to in his brilliant TED talk about a world run by algorithms — it is a metaphor to encourage us to think about data and maths from a contemporary point of view.

 

About the Analytics
The underlying data set has been extracted from the Teradata Aster Lens environment and processed with Gephi, an open-source tool for visual data analysis and exploration.

"The Leaf" applies a Radial Axis Layout, which distributes the nodes on linear axes radiating from a circle. Grouping and ordering the nodes on an axis by degree produces the straight line of nodes along the centre of the graph (leaf). The actual leaf is then drawn automatically by adding curved edges between the nodes and applying a greenish colour range to nodes and edges. Et voilà, here is "The Leaf" shown in the bottom right of the picture.

 

The single leaf created by the data visualization was added to the real-world photograph of the plant using Photoshop. This allows us to see how lifelike the digitally created leaf appears next to the real-world leaves of the plant.
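The degree ordering behind the leaf's straight spine can be illustrated with a short sketch. This is not Gephi's Radial Axis Layout implementation; it only shows the idea of computing each node's degree and placing the nodes along a single axis in that order. The edge list and the function names are assumptions for the example.

```python
from collections import Counter

# Hypothetical undirected edge list.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]

def degrees(edge_list):
    """Count how many edges touch each node."""
    degree = Counter()
    for u, v in edge_list:
        degree[u] += 1
        degree[v] += 1
    return degree

def axis_positions(edge_list, spacing=1.0):
    """Place nodes on one straight axis, highest degree first (ties by name)."""
    degree = degrees(edge_list)
    ordered = sorted(degree, key=lambda node: (-degree[node], node))
    return {node: (0.0, index * spacing) for index, node in enumerate(ordered)}

positions = axis_positions(edges)
for node, (x, y) in positions.items():
    print(node, x, y)
```

Drawing curved edges between nodes placed this way is what gives the outline its leaf-like shape in the final image.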

 

About the Analyst
Alexander is a founding contributor to The Art Of Analytics project. He has an unusually strong design eye matched with the technical proficiency to manipulate complex analytical images to emphasise their insights. Alexander is the producer of all The Art Of Analytics images, working with Teradata's Data Scientist Community. He specializes in manipulating Aster Lens and Gephi images to produce the exceptionally high-quality, high-resolution 'Art' pieces found in the collection.

 

Alexander is currently based in Zurich, having grown up near Frankfurt, Germany and graduated from Kingston University in London.

 

Shortly after, he began his analytics career working as a Business Intelligence Project Manager across various industries and geographical regions. During this time Alexander developed a keen understanding of the impact that visual imagery can have on the ability to communicate a message effectively. In particular, when dealing with mixed audiences, whatever their organizational hierarchy, expertise level or language skills, he found that pictures and visualizations were instrumental in forming a common understanding among the audience. Thus Alexander took an early interest in the form and structure of the different visual elements that aid communication.

Today Alexander works as a cross-industry Account Executive for Teradata in Switzerland, looking after and supporting a variety of Teradata customers as well as prospects. His passion for visual representation plays a major role in his current job, as he shares complex concepts and analytic insights with his clients.

 

And when Alexander is not out and about with his customers or prospects, or working late at night creating amazing pieces of 'Analytic Art' you might find him cruising on his motorbike through the Alps or traveling the world with his camera — always on the hunt for the next geocache and picture.

 

 

Using analytic techniques that normally follow the "Customer Journey," Teradata Think Big consultants and data scientists use data and analytics to visualize and identify "The Human Journey," allowing Buttle UK to identify and fulfill needs for at risk

 

 

Seattle has seven drawbridges that are frequently closed to traffic so that boats can enjoy the beautiful Pacific Northwest. These bridges have sensors that tweet every time they open or close, giving us a well-formatted dataset to explore and play with. In this post we begin by prepping and profiling the tweets from the last month.

Don’t miss this opportunity to witness how the new Teradata Analytics Platform modernizes an analytics environment and drives insights that produce high-impact, trusted business outcomes. Register for the webinar on Wednesday, September 12, at 11am PT (2pm ET).

 

[Image: Trapping Anomalies, Yasmeen Ahmad]

 

About the Insights

This visualisation represents the detection of anomalous broker behaviours found by an insurance provider. The visual representation of the data highlights how quickly these anomalies become apparent when looking at connections in a graphical format.

 

The dots (nodes) represent quotes that are created by brokers using a platform provided by the insurer. Links between nodes indicate quotes that are associated, i.e. a broker used a previously generated quote (node) to build a new quote (linked node) after making some changes. Typical broker behavior indicates that once a broker has generated a quote, it would only be accessed and refreshed if the quote lifespan ends before a customer has taken a decision to accept the quote. The two clusters in the centre (bluish) depict anomalous behavior, where a broker is continuously returning and refreshing the same quote after changing a small number of attributes on that quote. This indicates the broker is gaming the insurer's system in an attempt to determine how the pricing engine works. This is undesired behavior and a fraudulent use of the insurer's system.

 

The goal of this analysis was to identify how brokers use the insurer's system and to understand positive broker behaviours that lead to product sales. The aim was to identify how the system could be improved to support brokers and provide a better experience, as well as to find preferential behaviours that support the insurer’s business and could be promoted to less successful brokers. The discovery of fraudulent behaviour was a byproduct of this analysis. The insurer can use this visual as evidence when holding follow-up conversations with the broker involved.

 

About the Analytics

This sigma visualization depicts analysis of data generated by a platform provided by an insurer for their brokers. This system logs all actions carried out by a broker on the platform. The initial part of the analysis involved identification of broker sessions on the platform and matching of sessions to a specific broker and customer. Within these sessions, this analysis focused on the quote related actions logged by the broker platform. These actions were captured and modeled as nodes.

 

Each node represents a quote generated for a customer in a distinct session. Links were created between nodes if the broker accessed the same quote and generated a refreshed quote in a new session. Graph analysis identified two large unexpected clusters of highly interconnected nodes that were anomalous from the other nodes in the dataset.
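A simplified way to reproduce this kind of finding is to link quotes by their refresh relationship and look for unusually large connected groups. The sketch below uses connected components as a stand-in for the cluster detection; the original graph analysis may well have used a different method, and the quote ids, the `min_size` threshold and the function names are assumptions.

```python
from collections import defaultdict

# Hypothetical refresh links: (original quote id, refreshed quote id).
refresh_links = [
    ("q1", "q2"), ("q2", "q3"), ("q3", "q4"), ("q4", "q5"),
    ("q10", "q11"),
    ("q20", "q21"),
]

def connected_components(links):
    """Group quote ids that are connected through any chain of refreshes."""
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components

def unusually_large(components, min_size):
    """Keep only the groups with at least min_size quotes."""
    return [sorted(c) for c in components if len(c) >= min_size]

suspicious = unusually_large(connected_components(refresh_links), min_size=4)
print(suspicious)
```

A typical broker would produce small groups of one or two linked quotes, so a long chain of repeated refreshes stands out in the same way the two central clusters do in the visualization.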

 

About the Analyst

Yasmeen is one of the most creative and insightful Data Scientists at Teradata. Yasmeen grew up in Scotland, where she enjoys the great outdoors, in particular hiking the Scottish Munros and sea kayaking.

 

Her work has seen her traverse many countries, including the UK, Ireland, the Netherlands, Turkey, Belgium and Denmark, where she covers the finance, telecommunications, retail and utilities industries. Yasmeen specializes in working with businesses to identify their challenges and translate them into an analytical context. She has a unique ability to focus on how businesses can leverage new or untapped sources of data, alongside novel techniques, to enhance their competitive capabilities.

 

Yasmeen has worked with many analytical teams, providing leadership, training, guidance and hands-on support to deliver actionable insights and business outcomes. She uses various analytical approaches, including text analytics, predictive modelling, development of attribution strategies and time series analysis. She believes strongly in the power of visualizations and their ability to communicate complex findings to business users in a way that makes taking action easy.

 

Prior to Teradata, Yasmeen worked as a Data Scientist in the life sciences industry, building analytical pipelines for complex, multi-dimensional data types. Yasmeen also holds a PhD in Data Management, Mining and Visualization, which was carried out at the Wellcome Trust Centre for Gene Regulation & Expression. She has published several papers internationally and is a speaker at international conferences and events. In addition, she has taught on MSc courses related to Data Science and Business Intelligence.

 

Yasmeen developed a keen passion for data analytics and visualization through her studies, having always been curious to ask questions and learn more. These skills have allowed Yasmeen to explore many opportunities in multiple disciplines, providing her with an endless world of new challenges!

Combining the collaborative expertise of data scientists, geophysicists and data visualization, an integrated oil company developed new understandings of complex reservoir management with data and analytics. This business case easily transcends multiple industries focused on asset utilization and optimization.

 

 

[Image: Calling Circles, Christopher Hillman]

 

About the Insights

The mobile phones that we use every day and carry around everywhere with us create huge amounts of data that trace the daily patterns of our behavior. The interactions we have with others through calls or messages map out our social relationships, business dealings and interactions with the wider community as complex interconnected circles of calls.

 

This data visualization was created using mobile phone subscriber calling patterns. Each dot (or node) represents a phone number that is called by a subscriber; the larger the node, the more often the number is called. The lines (or edges) between nodes represent a call from one number to another.

 

Each subscriber will have a unique calling pattern that can be used to develop pricing plans, to identify him or her, and even to predict his or her behavior. For instance, a subscriber who is in the process of switching to a different network provider will show up as two similar patterns, one from an on-net number and one from an off-net number.
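One simple way to compare two calling patterns is to measure the overlap between the sets of numbers each one calls. The sketch below uses Jaccard similarity for this; it is an illustrative choice, not the method used in the original analysis, and the phone numbers are invented.

```python
def jaccard(a, b):
    """Share of distinct contacts that two calling patterns have in common."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical sets of numbers called from an on-net and an off-net number.
on_net_contacts = {"555-0101", "555-0102", "555-0103", "555-0104"}
off_net_contacts = {"555-0101", "555-0102", "555-0103", "555-0199"}

similarity = jaccard(on_net_contacts, off_net_contacts)
print(similarity)
```

A high score between an on-net and an off-net number would hint that both belong to the same person, which is the switching signal described above.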

 

This particular chart was produced at an early stage in a series of analytics and was used to filter out the first level of calling pattern types. The data used here represents a very short period of time, just a few seconds. At the top right-hand side of the graph we can see large loops that show numbers that have been called many times in this short period. These are likely to be machines, such as auto-dialer systems that play prerecorded messages when answered, Interactive Voice Response (IVR) systems, security systems and alarms. Humans would not be able to make so many calls so quickly. These numbers were isolated as a separate segment, and subsequent analysis focused on the detailed individual human calling patterns.
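The filtering step can be approximated by counting how many calls each number is involved in within the short window and setting aside those above a plausible human limit. The record layout, the window length and the threshold below are all assumptions chosen for the example.

```python
from collections import Counter

# Hypothetical call records: (calling number, called number, seconds since
# the start of the window).
calls = [
    ("555-0101", "555-0900", 0.5),
    ("555-0102", "555-0900", 1.0),
    ("555-0103", "555-0900", 1.2),
    ("555-0104", "555-0900", 2.0),
    ("555-0101", "555-0105", 3.0),
    ("555-0106", "555-0107", 4.0),
]

def likely_machines(records, window_seconds, max_human_calls):
    """Return numbers involved in more calls than a person could handle."""
    involvement = Counter()
    for caller, called, offset in records:
        if 0 <= offset < window_seconds:
            involvement[caller] += 1
            involvement[called] += 1
    return sorted(n for n, count in involvement.items() if count > max_human_calls)

machines = likely_machines(calls, window_seconds=5, max_human_calls=2)

# Keep only calls where neither side looks like a machine.
human_calls = [c for c in calls if c[0] not in machines and c[1] not in machines]
print(machines)
print(human_calls)
```

The remaining records are the ones that would feed the later, more detailed analysis of individual human calling patterns.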

 

About the Analytics

This visualization shows a representation of a graph, although the layout parameters have been used to create a format that is unlike those typically used to display graphs. An issue commonly faced in this area is that connected graphs quickly become huge and are almost impossible to visualize due to the sheer number of callers and interactions. Taking a sample from a highly connected graph is a difficult problem, as we need to decide which connections to ignore. In this case a very short period of time is used to cut down the output to a manageable size.

 

The underlying data format is rather simple: calling number, called number, time of day and duration. The data is first clustered using a machine-learning algorithm to create the groups and then displayed as a graph using Aster Lens.

 

About the Analyst

Christopher Hillman is based in London, UK, with his wife and two kids. He is a Principal Data Scientist in the Advanced Analytics team at Teradata and travels extensively in the International Region.

 

His passion for analytics spans 20 years of experience working in the business intelligence and advanced analytics industries. Prior to Teradata, Chris specialized in the Retail and CPGN vertical, working as Solution Architect, Principal Consultant and Technology Director. Chris currently works with the Teradata Aster Centre of Expertise and is involved in start-up analytics for Big Data projects, helping customers to unlock insights from their data, including understanding where MapReduce or SQL is an appropriate technique to use.

 

As well as working for Teradata, Christopher is currently studying part-time for a PhD in Data Science at the University of Dundee, applying Big Data analytics to the data produced from experimentation into the Human Proteome. His research area involves real-time analysis of Mass Spectrometer data using parallel algorithms. Part of his duties at the University includes lecturing on Hadoop and MapReduce coding.

Data Science Informative Articles and Blog Posts

Our blogs allow customers, prospects, partners, third-party influencers and analysts to share thoughts on a range of product and industry topics. The community also holds plenty of content that lets you gain valuable insights from Teradata data scientists.