Coding Sessions on Twitch TV

I will update this section after every Twitch session with new material: the .nb notebook, the YouTube link, and any other files (mostly data files) needed as input, as well as comments/addenda I have after the session.

First session, Apr 26, 2019, 5pm EDT, Operator Notation with Applications

This first session provides a general introduction to the operator notation. We also use some pattern matching and demonstrate function composition: right-composition and left-composition. As an application we look at free stock and options data from Deutsche Börse AG, provided through the AWS data registry.
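The notation covered can be sketched in a few lines (a minimal sketch; the stock/options data from the session is not reproduced here):

```mathematica
(* operator (curried) form: Select[crit] is a function that can be applied later *)
Select[EvenQ] @ Range[10]                 (* {2, 4, 6, 8, 10} *)

(* right-composition: functions apply left to right *)
(Sort /* Reverse /* First) @ {3, 1, 2}    (* 3 *)

(* left-composition: functions apply right to left *)
(First @* Reverse @* Sort) @ {3, 1, 2}    (* 3 *)

(* pattern matching combined with an operator form of ReplaceAll *)
ReplaceAll[x_Integer :> x^2] @ {1, 2, a}  (* {1, 4, a} *)
```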

Second session, May 2, 2019, 5pm EDT, Introduction to Association and Dataset, part 1

This is the first of three parts introducing Association and Dataset. An Association is a high-performance structure for key-value pairs in which keys and values can be arbitrary expressions, and it "behaves" like a list. A Dataset is an interactive way to study tabular data, allowing deep drill-downs and slicing and dicing of the data in any manner. As examples we use the Titanic dataset, the Deutsche Börse data, and the US 2016 presidential election data.
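A minimal sketch of both structures (the Titanic data ships with Mathematica as ExampleData; the particular query is illustrative, not taken from the session):

```mathematica
(* an Association: keys and values can be arbitrary expressions *)
a = <|"name" -> "Titanic", "year" -> 1912, {1, 2} -> Pi|>;
a["year"]    (* 1912 *)
Keys[a]      (* {"name", "year", {1, 2}} *)

(* a Dataset over a list of associations supports interactive drill-down *)
ds = Dataset[ExampleData[{"Dataset", "Titanic"}]];
ds[Select[#class == "1st" &] /* Length]   (* number of first-class passengers *)
```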

Third session, May 21, 2019, 5pm EDT, Introduction to Association and Dataset, part 2

In the third session we continue our introduction to Association and Dataset. In particular, we look more closely at tree-based modeling and manipulation of data with Associations. We use the Titanic dataset again to understand the layers/levels of the data, highlighted with colors. Next, we use the planets dataset to look more closely at non-rectangular data in an Association.
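The tree-shaped, non-rectangular idea can be sketched like this (the mini planet association below is made up for illustration; the session uses the full planets dataset):

```mathematica
(* nested associations model tree-shaped, non-rectangular data:
   Mercury has no moons, Earth has one *)
planets = <|
  "Mercury" -> <|"mass" -> 0.055, "moons" -> <||>|>,
  "Earth"   -> <|"mass" -> 1.0,   "moons" -> <|"Moon" -> <|"mass" -> 0.0123|>|>|>
|>;

(* multi-key lookup descends the tree level by level *)
planets["Earth", "moons", "Moon", "mass"]   (* 0.0123 *)

(* extract one field at every planet *)
Query[All, "mass"] @ planets                (* mass field for each planet *)
```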

Fourth session, June 4, 2019, 5pm EDT, Introduction to Association and Dataset, part 3

In the fourth session we close our introduction to Association and Dataset with some advanced uses of Associations.
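A few of the higher-level Association operations in this area (the specific inputs are illustrative, not from the session):

```mathematica
(* Merge combines several associations with a combining function *)
Merge[{<|"a" -> 1, "b" -> 2|>, <|"a" -> 10, "c" -> 3|>}, Total]
(* <|"a" -> 11, "b" -> 2, "c" -> 3|> *)

(* map over key-value pairs, build associations, group into associations *)
KeyValueMap[#2 -> #1 &, <|"x" -> 1|>]   (* {1 -> "x"} *)
AssociationMap[#^2 &, {1, 2, 3}]        (* <|1 -> 1, 2 -> 4, 3 -> 9|> *)
GroupBy[Range[6], EvenQ]                (* <|False -> {1, 3, 5}, True -> {2, 4, 6}|> *)
```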

Fifth session, June 18, 2019, 4pm EDT, Dataset, Query, and Web Scraping of Free Data

In this session we close our introduction to Dataset and introduce Query (which Dataset is built on), show some techniques for scraping data from the web, and then visualize the results in a Dataset. First, we use the planets data to demonstrate all documented Dataset/Query usage patterns, showing the Dataset version, the Query version, and the raw-data Query version next to one another. Next, we use Query on financial portfolio data (and display the results in Datasets). Third, we show different techniques (direct import of HTML, plain HTML/tag parsing without JavaScript, HTML/tag parsing with JavaScript, webMathematica) to extract data from websites and analyze/display the results.
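The Dataset/Query equivalence, and the simplest of the scraping techniques, look roughly like this (portfolio values and the URL are placeholders):

```mathematica
(* a tiny, made-up portfolio *)
ds = Dataset[{<|"sym" -> "DB1", "px" -> 61.2|>, <|"sym" -> "SAP", "px" -> 118.5|>}];

(* the same operation written as Dataset syntax and as a standalone Query *)
ds[Select[#px > 100 &], "sym"]           (* Dataset syntax *)
Query[Select[#px > 100 &], "sym"] @ ds   (* equivalent Query applied to the Dataset *)

(* simplest scraping technique: "Data" imports the parsed tables/lists of a page *)
tables = Import["https://example.com", "Data"];
```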

Sixth session, Aug 8, 2019, 4pm EDT, Outlier Detection Methods, part 1

In the sixth session we introduce some methods for outlier detection. This is part 1 of 2. Mathematica makes it particularly easy to handle the data and apply both simple and advanced methods as well as diagnostic visualizations of outliers, and we include a more detailed discussion of why outliers should not always be discarded automatically. We look closely at the famous box plot but also discuss its caveats, i.e., when not to use it. We exemplify these concepts with various datasets (atmospheric, health, US 2016 presidential election) and introduce the k-nearest-neighbor concept, which plays an important role in AI/ML.
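The box-plot outlier rule and the nearest-neighbor building block can be sketched as follows (the data vector is made up for illustration):

```mathematica
data = {2., 3., 3., 4., 5., 5., 6., 42.};

(* box-plot rule: flag points beyond 1.5 IQR outside the quartiles *)
{q1, q3} = Quantile[data, {1/4, 3/4}];
iqr = q3 - q1;
Select[data, # < q1 - 1.5 iqr || # > q3 + 1.5 iqr &]   (* {42.} *)

(* nearest neighbors: the building block of k-nearest-neighbor methods *)
Nearest[data, 42., 3]   (* the 3 points closest to 42. *)
```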

Seventh session, Sep 4, 2019, 5pm EDT, Outlier Detection Methods, part 2

The seventh session closes the introduction to outlier detection methods. We focus specifically on the function FindAnomalies, which is new in version 12. After a recap of the basics of (continuous) probability theory comes a mathematical introduction to FindAnomalies, so we can understand its inner workings. We apply it to different distributions and plot the outcomes. We study the outputs of FindAnomalies and learn how to interpret them, then apply it to the 2016 US presidential election data, use it for visualization of technical trading systems, and finally apply it to major/minor chord detection in sound samples from a church organ. We close with a discussion of several caveats about the use of FindAnomalies.
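In its simplest form, FindAnomalies can be applied directly to data (a minimal sketch with synthetic data; the session's datasets are not reproduced here):

```mathematica
(* normal data with two injected anomalies *)
SeedRandom[1];
sample = Join[RandomVariate[NormalDistribution[0, 1], 200], {8.5, -9.1}];

(* flag elements that are improbable under the learned distribution *)
FindAnomalies[sample]   (* typically the two injected points *)

(* tighten/loosen detection with the AcceptanceThreshold option *)
FindAnomalies[sample, AcceptanceThreshold -> 0.01]
```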

Eighth session, Sep 20, 2019, 5pm EDT, Some Basics of Continuous Distributions, applied to RarerProbability

In the eighth session I focus on a more detailed explanation of RarerProbability and point out several general characteristics of continuous distributions, which I then illustrate with RarerProbability. In particular, I show how to exploit symmetry to significantly reduce computation time, use numeric instead of symbolic computation of integrals, show ways to avoid +/- infinity as limits in improper integrals, and emphasize that not all distributions are symmetric, not all have tails, and not all areas that contribute to RarerProbability lie in the tails (they can lie between the modes of a multi-modal distribution).
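The symmetry point can be sketched with a standard normal distribution (the mixture example below is illustrative):

```mathematica
d = NormalDistribution[0, 1];

(* probability of drawing a value with lower density than at x = 2 *)
RarerProbability[d, 2.]    (* ≈ 0.0455: both tails beyond |x| = 2 *)

(* by symmetry, the same quantity is twice one tail *)
2 (1 - CDF[d, 2.])         (* ≈ 0.0455 *)

(* for a multi-modal distribution the "rare" region need not be in the tails:
   here the low-density region includes the valley between the two modes *)
m = MixtureDistribution[{1, 1},
    {NormalDistribution[-3, 1], NormalDistribution[3, 1]}];
RarerProbability[m, 0.]
```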

Ninth session, Oct 4, 2019, 5pm EDT, Date, Time, and Calendar Functionality

In the ninth session I show the most important date, time, and calendar functions, along with the TimeSeries, EventSeries, and TemporalData objects. I show date computations that depend on location, time zone, and calendar, as well as holiday computations that depend on the country, and show computations of Wiener processes, which I then visualize.
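A few representative calls from this area (the dates and process parameters are illustrative):

```mathematica
(* date arithmetic that respects time zones *)
d = DateObject[{2019, 10, 4}, TimeZone -> "America/New_York"];
DayName[d]                                   (* Friday *)
TimeZoneConvert[Now, "Europe/Berlin"]

(* business-day counting depends on the applicable holiday calendar *)
DayCount[d, DatePlus[d, 30], "BusinessDay"]

(* a Wiener process realization is TemporalData, directly plottable *)
td = RandomFunction[WienerProcess[0, 1], {0, 1, 0.01}];
ListLinePlot[td]
```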

Tenth session, Nov 21, 2019, 5pm EDT, Understanding the Gregorian Calendar with Continued Fractions

In the tenth session I introduce the Convergents function, which returns the list of convergents of a continued fraction expansion. I then use it to show just how tremendously precise the Gregorian calendar we use worldwide today is. In my opinion, the continued fraction convergents are the best way to understand the leap year rule.
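The core of the argument fits in a few lines (the year length is the mean tropical year rounded to five decimals):

```mathematica
(* mean tropical year in days, as an exact rational *)
year = 36524219/100000;

Convergents[year, 4]
(* {365, 1461/4, 10592/29, 12053/33} -- 1461/4 is the Julian rule 365 + 1/4 *)

(* the Gregorian rule 365 + 97/400 errs by roughly one day in 3000 years *)
N[365 + 97/400 - year]   (* ≈ 0.00031 days per year *)
```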

Eleventh session, Dec 19, 2019, 5pm EDT, Managing and Analyzing Large Data

In this session I demonstrate some techniques to speed up both data handling and data analysis through selective pre-processing of the original data. Don't carry data you don't need! Select only what you need, and load only that into memory! I then use the Amazon reviews dataset to conduct various analyses across the data files, with aggregations and different types of result visualizations.
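The "load only what you need" idea can be sketched with Import element specifications (the filename and column choices are placeholders):

```mathematica
(* read only rows 1-1000 and columns 1 and 3 of a large CSV,
   instead of materializing the whole file in memory *)
chunk = Import["reviews.csv", {"Data", Range[1000], {1, 3}}];

(* aggregate per key without retaining the raw rows *)
totals = GroupBy[chunk, First -> Last, Total];
```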

My home page: Andreas Lauschke Homepage