Visualization Final Note

Introduction

Visualization is the communication of information using graphical representations.

Why Is Visualization Important?

• provides an ability to comprehend huge amounts of data
• allows the perception of emergent properties that were not anticipated (new insight)
• often reveals problems with the data itself quickly (anomalies)
• facilitates understanding of both large-scale and small-scale features of the data
• facilitates hypothesis formation

What Should a Good Visualization Achieve?

• Show the data
• Induce the viewer to focus on the substance rather than the methodology, graphic design, the technology of graphic production, etc.
• Avoid distorting the data
• Present many numbers in a small space
• Encourage the eye to compare different pieces of data
• Reveal the data at several levels of detail, from a broad overview to the fine structure
• Serve a reasonably clear purpose: description, exploration, tabulation, or decoration

  • Computer graphics

    – Digital synthesis and manipulation of visual contents (geometry/imaging/rendering/animation)

    – Visual realism is one of the primary goals

    – Big impact in animation/movies/video games

  • Visualization

    – Applies computer graphics techniques to generate visual display of data

    – Emphasizes effective communication of information

Perception & Color

Perception

Perception is the process by which we interpret the world around us, forming a mental representation of the environment.
• Processing of our sensory information
– Recognizing (being aware of)
– Organizing (gathering and storing)
– Interpreting (binding to knowledge)

Preattentive Processing

The automatic mechanisms that operate prior to the action of attention
• Uncontrolled perception
• Features are detected very quickly, usually in less than 10 msec per item
• Time taken to find the target is independent of the number of distracting nontargets
• Allows detecting features in parallel

Effective Use of Conjunctions

Color

Primary Colors
• Red, Green, Blue
– liquid crystal, CRT displays
• Red, Yellow, Blue
– paint
• Cyan, Magenta, Yellow
– color printing
• Orange, Green, Violet
– color photography

Cones
– active at normal light levels
– color vision involves cones only
Rods
– sensitive at low light levels
– not sensitive to color
– responsible for our dark-adapted vision
– low influence on color perception

There is an uneven distribution of cones and rods in the retina

Lightness Scales

  • Lightness / Brightness

    – (qualitative) perceived reflectance of a surface

  • Luminance

    – (quantitative) measured amount of light energy weighted by the spectral sensitivity function of the human eye
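
The distinction between luminance and lightness can be sketched numerically. The example below is an illustration only: it assumes sRGB input and uses the standard Rec. 709 luminance weights and the CIE L* formula, none of which are named in these notes. The function names are made up for this sketch.

```python
def srgb_to_linear(c: float) -> float:
    """Undo the sRGB transfer curve for one channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(r: int, g: int, b: int) -> float:
    """Relative luminance Y in [0, 1] from 8-bit sRGB values (assumed input)."""
    rl, gl, bl = (srgb_to_linear(v / 255.0) for v in (r, g, b))
    # Green carries the largest weight: the eye is most sensitive there.
    return 0.2126 * rl + 0.7152 * gl + 0.0722 * bl

def cie_lightness(y: float) -> float:
    """CIE L* (0..100): a perceptual lightness scale computed from luminance."""
    if y <= (6 / 29) ** 3:
        return (29 / 3) ** 3 * y
    return 116 * y ** (1 / 3) - 16

for name, rgb in [("red", (255, 0, 0)), ("green", (0, 255, 0)), ("blue", (0, 0, 255))]:
    y = relative_luminance(*rgb)
    print(f"{name:5s} luminance={y:.3f} lightness={cie_lightness(y):.1f}")
```

Pure green has a much higher luminance than pure blue at the same channel value, which is one reason a colour scale that varies only in hue can look uneven in lightness.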

Temporal, Geospatial & Multivariate Data

Some Basic Plots

Bar Charts / Histograms

To show distribution of values of a single variable
• Values are divided into bins
• A bar plot is used so that the height of each bar indicates the number of objects in each bin
• Shape of histogram depends on the number of bins
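
A minimal sketch of the binning step, assuming equal-width bins over the data range (the notes do not fix a binning rule). The `histogram` helper and the sample data are hypothetical; the loop shows how the same data takes a different shape as the number of bins changes.

```python
def histogram(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * n_bins
    for v in values:
        # The maximum value goes into the last bin instead of opening a new one.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
for n in (2, 4, 8):
    counts = histogram(data, n)
    print(n, counts, " ".join("#" * c for c in counts))
```

With 2 bins the outlier at 9 is hidden inside a wide bin; with 8 bins it stands apart from the main cluster.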

Box Plots

To show quantitative distribution of 1D data
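
The numbers a box plot draws can be computed as follows. This is a sketch under assumptions the notes do not state: quartiles use the "inclusive" method of Python's `statistics.quantiles`, and whiskers follow the common 1.5 × IQR convention. `box_plot_stats` is a hypothetical helper name.

```python
import statistics

def box_plot_stats(values, whisker=1.5):
    """Quartiles, whisker ends and outliers for a 1D sample."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - whisker * iqr, q3 + whisker * iqr
    # Whiskers end at the most extreme data points still inside the fences.
    inside = [v for v in data if lo_fence <= v <= hi_fence]
    return {
        "q1": q1, "median": median, "q3": q3,
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": [v for v in data if v < lo_fence or v > hi_fence],
    }

stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```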

2D Bar Charts

To show the joint distribution of the values of two variables

The 3D effect is poor at showing exact values, but the correlation can be seen clearly

Line Graphs

Scatter Plots

An array (matrix) of scatter plots can compactly summarize the relationships between pairs of attributes

Contour Plots

To show continuous attributes measured on a spatial grid.
The space is partitioned into regions of similar values;
the boundaries of these regions are contour lines, also called iso-value lines or isolines.
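
One way to locate an isoline on a grid is to find every grid edge whose two endpoints lie on opposite sides of the iso-value and interpolate linearly along it. The sketch below shows only that step (it is the core of marching-squares style methods, which the notes do not name); `isoline_crossings` and the sample grid are hypothetical.

```python
def isoline_crossings(grid, iso):
    """Return (x, y) points where the iso-value crosses a grid edge.

    grid[row][col] holds the sample at x = col, y = row.
    """
    points = []
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            a = grid[r][c]
            # Check the edge to the right neighbour and to the neighbour below.
            for dr, dc in ((0, 1), (1, 0)):
                r2, c2 = r + dr, c + dc
                if r2 < rows and c2 < cols:
                    b = grid[r2][c2]
                    if (a < iso) != (b < iso):  # endpoints on opposite sides
                        t = (iso - a) / (b - a)  # linear interpolation factor
                        points.append((c + t * dc, r + t * dr))
    return points

peak = [[0, 0, 0],
        [0, 10, 0],
        [0, 0, 0]]
print(isoline_crossings(peak, iso=5))
```

Connecting the crossing points cell by cell would give the contour line itself; that step is omitted here.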

Temporal Data

Set of values that change over time

Common requirements:
– Able to compare many time series simultaneously
– Able to use different visualizations in combination

Stacked Graphs

Stack area charts on top of each other
• Useful for showing summation of time-series values (aggregation)
• Limitation:
– negative numbers not supported
– difficult to interpret trends accurately
– meaningless for some kinds of data (e.g., temperatures)
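
The stacking step amounts to a running sum: each series is drawn between the top of the previous one and that top plus its own values. A minimal sketch, with the hypothetical helper `stack_series`, which also makes the first limitation explicit by rejecting negative values:

```python
def stack_series(series):
    """Turn equal-length series into (baseline, top) bands for an area chart."""
    baseline = [0.0] * len(series[0])
    bands = []
    for values in series:
        if any(v < 0 for v in values):
            raise ValueError("stacked graphs do not support negative values")
        top = [b + v for b, v in zip(baseline, values)]
        bands.append((baseline, top))
        baseline = top  # the next series sits on top of this one
    return bands

bands = stack_series([[1, 2, 3], [2, 2, 2], [0, 1, 4]])
for base, top in bands:
    print(base, "->", top)
```

The top of the last band is the total of all series at each time step. Only the bottom series has a flat baseline, which is why trends in the upper series are hard to read accurately.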

Horizon Graphs

Divide the area plot into horizontal bands and layer them over each other.
• Useful for increasing the data density (i.e. save space) without sacrificing resolution.
• Limitation: Not intuitive and takes time to learn
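
The banding step can be sketched as below. The helper name `horizon_bands` and its parameters are assumptions made for illustration; negative values are mirrored and keep their sign so that a renderer could draw them in a different hue, which is a common choice but not something these notes specify.

```python
def horizon_bands(values, band_height, n_bands):
    """Split each value into n_bands layers, each at most band_height tall."""
    layers = [[] for _ in range(n_bands)]
    for v in values:
        magnitude = abs(v)
        sign = 1 if v >= 0 else -1
        for k in range(n_bands):
            # Portion of the value that falls inside band k, clamped to the band.
            part = min(max(magnitude - k * band_height, 0), band_height)
            layers[k].append(sign * part)
    return layers

layers = horizon_bands([5, 15, 25, -12], band_height=10, n_bands=3)
for k, layer in enumerate(layers):
    print("band", k, layer)
```

Drawing all layers over the same strip, with darker shades for higher bands, needs a third of the vertical space here while each band keeps its full resolution.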

Geospatial Data

Data that refers to a specific location in the world.
– e.g., population, health data, traffic, etc.
• Visualization techniques are used intensively in geographic information systems (GIS) and cartography.
• Issues:
– Geographical aggregation
• Recall the London Cholera Case
– Map projection

Map Projections

  • A mapping from a position on Earth (spherical surface) to a position on screen (a flat plane)

  • From a longitude/latitude pair (λ, φ) to screen coordinates (x, y)

All map projections introduce some distortion

Projection methods differ by spatial properties that they preserve
– Conformal (preserves local angle and thus shape; not area-preserving)
– Equal area (preserves area; shape can change)
– Equidistant (preserves distance from a specific point or line)
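
As an illustration, here is one well-known cylindrical projection for each property class. The choice of examples (Mercator, Lambert cylindrical equal-area, equirectangular) is an assumption of this sketch, not something taken from the notes; each function maps longitude and latitude in degrees to planar (x, y) on a sphere of the given radius.

```python
import math

def mercator(lon_deg, lat_deg, radius=1.0):
    """Conformal: preserves local angles, inflates area toward the poles."""
    lam, phi = math.radians(lon_deg), math.radians(lat_deg)
    return radius * lam, radius * math.log(math.tan(math.pi / 4 + phi / 2))

def lambert_cylindrical_equal_area(lon_deg, lat_deg, radius=1.0):
    """Equal area: preserves area, squashes shapes away from the equator."""
    lam, phi = math.radians(lon_deg), math.radians(lat_deg)
    return radius * lam, radius * math.sin(phi)

def equirectangular(lon_deg, lat_deg, radius=1.0):
    """Equidistant along meridians: y is proportional to latitude."""
    lam, phi = math.radians(lon_deg), math.radians(lat_deg)
    return radius * lam, radius * phi

for project in (mercator, lambert_cylindrical_equal_area, equirectangular):
    x, y = project(10.0, 60.0)
    print(f"{project.__name__:32s} x={x:.4f} y={y:.4f}")
```

All three give the same x, but at 60° latitude Mercator stretches y, the equal-area projection compresses it, and the equirectangular projection keeps it proportional to latitude.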

Multivariate Data

Dimension Reduction

• To remove some of the dimensions from the display to avoid clutter
• Examples: Principal Component Analysis (PCA), Multidimensional Scaling (MDS), Self-Organizing Maps (SOM)
• Issue: Resulting dimensions are not the original ones, not intuitive to users
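
To make the issue concrete, the sketch below finds the first principal component of a small dataset by power iteration on the covariance matrix. It is a bare-bones illustration, not a full PCA: it returns only one component, assumes the data has non-zero variance, and assumes the starting vector is not orthogonal to the answer. `first_principal_component` is a hypothetical name.

```python
import math

def first_principal_component(rows, iterations=100):
    """Unit direction of greatest variance, found by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # assumed not orthogonal to the leading eigenvector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Three original dimensions; all variation lies along the direction (1, 2, 0).
rows = [(t, 2 * t, 5.0) for t in range(6)]
pc1 = first_principal_component(rows)
print([round(x, 4) for x in pc1])
```

The resulting axis is a mixture of the first two original dimensions, so a value along it cannot be read as either of them directly.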

Dimension Ordering

• Crucial for the effectiveness of many visualization techniques
• Relationships among adjacent dimensions are easier to detect than relationships among those positioned far apart, e.g., in Parallel Coordinates and Heat Maps
• Also used in attribute mapping to highlight important dimensions, e.g., Chernoff faces
• An NP-complete problem equivalent to the Travelling Salesman Problem (TSP)
• Compute the ordering by approximation, or order the dimensions manually (interaction needed)
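
One possible approximation is the nearest-neighbour heuristic borrowed from the TSP: start from a chosen dimension and repeatedly append the unplaced dimension most similar to the last one. The sketch below assumes absolute Pearson correlation as the similarity measure; both that choice and the helper names are illustrative, not taken from the notes.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def greedy_dimension_order(columns, start):
    """Order dimensions so that strongly correlated ones end up adjacent."""
    order = [start]
    remaining = set(columns) - {start}
    while remaining:
        last = order[-1]
        # sorted() makes tie-breaking deterministic.
        nxt = max(sorted(remaining),
                  key=lambda name: abs(pearson(columns[last], columns[name])))
        order.append(nxt)
        remaining.remove(nxt)
    return order

columns = {
    "a": [1, 2, 3, 4, 5],
    "b": [1, 2, 3, 4, 6],
    "c": [3, 1, 4, 1, 5],
    "d": [5, 4, 3, 2, 1],
}
print(greedy_dimension_order(columns, start="a"))
```

The heuristic is fast but offers no optimality guarantee, which is why manual reordering through interaction remains useful.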

Volume Data

Volume data is essentially a trivariate scalar function.
A scalar value is defined at every (x, y, z) in the volume domain: v = f(x, y, z).
If we sample the 3D domain discretely, we obtain a voxel (volume element) representation.
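
A small sketch of that sampling step: a scalar function is evaluated on a regular n × n × n grid to produce voxels. The function `sphere_distance` and the grid bounds are made-up examples.

```python
def sample_volume(f, n, lo=-1.0, hi=1.0):
    """Sample f(x, y, z) on a regular n x n x n grid over [lo, hi]^3."""
    step = (hi - lo) / (n - 1)
    coords = [lo + i * step for i in range(n)]
    # voxels[i][j][k] is the scalar value at (coords[i], coords[j], coords[k]).
    return [[[f(x, y, z) for z in coords] for y in coords] for x in coords]

def sphere_distance(x, y, z):
    """Example scalar field: distance from the origin."""
    return (x * x + y * y + z * z) ** 0.5

voxels = sample_volume(sphere_distance, n=5)
print(voxels[2][2][2], voxels[4][2][2])
```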

Trees

Graphs

Graph Drawing

Requirements
• Drawing conventions
– Rules specific to certain applications / professions
• Aesthetics
– Readability, layout
• Constraints

Key issues
• Graph size
– need filtering, clustering?
• Predictability
– similar drawing for the same graph every time?
• Time complexity
– Is real-time interaction possible?

Trees

A tree is a directed acyclic graph in which:
– There is exactly one vertex, called the root, with no parent
– Every vertex except the root has exactly one parent
– There is a path from the root to each vertex
• Trees are good for representing:
– Hierarchies (file systems, web sites, organization charts)
– Branching Processes (family lineage, evolution)
– Decision processes (search trees, game trees, decision trees)

Layouts

  • Indentation
    – Tree depth is encoded by indentation

  • Node-Link Diagrams
    – Nodes connected by lines/curves

  • Layered Diagrams
    – Hierarchical structure represented by layering, adjacency or alignment

  • Enclosure/Containment Diagrams
    – Hierarchical structure represented by enclosure

In general, tree layout can be done efficiently in O(n) or O(n log n) time
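
Two of the layouts above can be sketched in a few lines. The tree representation (a `(name, children)` tuple with unique names) and both helper names are assumptions of this example. The node-link layout is a deliberately simple one: depth gives y, leaves take consecutive x slots, and each parent is centred over its children, which takes a single traversal, i.e. O(n).

```python
def indented(tree, depth=0):
    """Indentation layout: one line per node, depth encoded by leading spaces."""
    name, children = tree
    lines = ["  " * depth + name]
    for child in children:
        lines.extend(indented(child, depth + 1))
    return lines

def node_link_positions(tree):
    """Assign (x, y) to every node in one post-order traversal."""
    positions = {}
    next_leaf_x = 0

    def place(node, depth):
        nonlocal next_leaf_x
        name, children = node
        if not children:
            x = float(next_leaf_x)  # leaves fill slots left to right
            next_leaf_x += 1
        else:
            xs = [place(child, depth + 1) for child in children]
            x = (xs[0] + xs[-1]) / 2  # centre the parent over its children
        positions[name] = (x, depth)
        return x

    place(tree, 0)
    return positions

tree = ("root", [("a", [("a1", []), ("a2", [])]), ("b", [])])
print("\n".join(indented(tree)))
print(node_link_positions(tree))
```

More compact layouts need extra passes to pull subtrees together, but the idea of placing children first and parents afterwards is the same.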

Problems:
– A change in the dataset can cause a dramatic, discontinuous change in the layout
– Order is not preserved (Solution: Ordered Treemaps)

Problems:
– Computation involves an iterative process, which can be inefficient

Networks

Graph Drawing

• Direct calculation based on graph structure
– Spanning tree
– Adjacency matrix layout
• Optimization-based
– Optimizing the graph aesthetic constraints
– Force-directed layout

A spanning tree can be obtained by
– Breadth-First Search (BFS) / Depth-First Search (DFS)
– Minimum/maximum spanning tree algorithms
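
A minimal sketch of the BFS variant, assuming the graph is connected and given as an adjacency-list dictionary (`bfs_spanning_tree` is a hypothetical helper). The edges it returns form a tree that can then be drawn with any tree layout; the remaining graph edges are added afterwards.

```python
from collections import deque

def bfs_spanning_tree(adjacency, root):
    """Return the (parent, child) tree edges found by breadth-first search."""
    visited = {root}
    tree_edges = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                tree_edges.append((node, neighbour))
                queue.append(neighbour)
    return tree_edges

graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c"],
}
print(bfs_spanning_tree(graph, "a"))
```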

Severe edge crossings and cluttering

Clustering

• Structure-based clustering
– Use only structural information of a graph
• Content-based clustering
– Use semantic data associated with graph elements
– Application specific

Measures the importance of a person in passing information within a network

Measures how close a person is to the others

Measures how well a person’s friends are connected to each other
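
The three measures described above match what the network-analysis literature usually calls betweenness centrality, closeness centrality, and the local clustering coefficient. That mapping is an assumption of this sketch, since the notes give only the descriptions. The code assumes an undirected, unweighted, connected graph stored as a dictionary of neighbour sets; betweenness uses Brandes' accumulation scheme.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def closeness(adj, v):
    """(n - 1) divided by the sum of distances from v to the others."""
    dist = bfs_distances(adj, v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbours that are themselves connected."""
    nbrs = sorted(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2 * links / (k * (k - 1))

def betweenness(adj):
    """Number of shortest paths between other pairs that pass through each vertex."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # farthest vertices first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}

# A triangle a-b-c with one extra person d who knows only c.
friends = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(betweenness(friends))
print({v: round(closeness(friends, v), 2) for v in friends})
print({v: round(clustering_coefficient(friends, v), 2) for v in friends})
```

In this example c is the only route to d, so c scores highest on both betweenness and closeness, while a and b have the most tightly connected friends.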

Text & Document

Text Analytics

  • Text Summarization

    To generate a precise summary of a given large amount of text information

  • Opinion Mining (aka Sentiment Analysis)

    To extract and quantify affect (feelings) and subjective information from a large amount of text

  • Going a step further:
    – How do the reviews of your products compare with your competitors’?
    – Are there any correlations between blogger sentiment and movie box-office results?

Levels of Text Representations

  • Lexical level

    To group a string of characters into tokens, which are the basic units of text to be analyzed and are application dependent

  • Syntactic level

    To identify the grammatical role of each token

  • Semantic level

    To extract the meaning of the syntactic structures of the full text
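
The lexical level is the easiest to sketch. The example below assumes that a token is a lower-cased run of letters, digits, or apostrophes and uses a tiny made-up stop-word list; a real application would pick its own rules, which is what "application dependent" refers to. Term frequencies like these are a typical input to text visualizations.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "a", "is", "and", "of", "to"}  # tiny illustrative list

def tokenize(text):
    """Lexical level: lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def term_frequencies(text):
    """Count tokens, ignoring the stop words."""
    return Counter(t for t in tokenize(text) if t not in STOP_WORDS)

print(tokenize("The cat sat. The cat's hat!"))
print(term_frequencies("The movie is great, and the plot is great"))
```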

Interaction

Interaction is useful for integrating the human into the data exploration process and applying human perceptual abilities

  • Interaction:
    – Allows the user to interact with the visualization and dynamically change the visualizations according to the exploration objectives (as compared to static visualization on paper)
    – Enables relating and combining multiple independent visualizations
  • Distortion:
    – Allows focusing on details while preserving an overview of the data
    – Shows portions of the data at a high level of detail, while others are shown at a lower level of detail

  • overview+detail: spatial separation

  • zooming: temporal separation
  • focus+context: seamless focus in context
  • linking and brushing: integrate data in different views
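
A focus+context distortion can be sketched in one dimension with the graphical fisheye transfer function g(t) = (d + 1)t / (dt + 1), where t is the normalized distance from the focus and d is the distortion factor. This is the Sarkar and Brown formulation, chosen here as an example; the notes do not say which distortion function they have in mind, and the `fisheye` helper is hypothetical.

```python
def fisheye(x, focus, d=3.0):
    """Map a position x in [0, 1] to its distorted position in [0, 1].

    Positions near the focus are spread apart (magnified); positions near
    the borders are compressed but stay visible. d = 0 means no distortion.
    """
    span = (1.0 - focus) if x >= focus else focus  # distance from focus to border
    if span == 0:
        return x
    t = abs(x - focus) / span           # normalized distance in [0, 1]
    g = (d + 1) * t / (d * t + 1)       # fisheye transfer function
    direction = 1 if x >= focus else -1
    return focus + direction * g * span

for x in (0.0, 0.4, 0.5, 0.6, 1.0):
    print(x, "->", round(fisheye(x, focus=0.5), 3))
```

The end points stay fixed, so the whole range remains on screen, while the gap between 0.5 and 0.6 more than doubles. The same stretching is why spatial judgments become unreliable under distortion.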

Problems of Distortion

• Not suitable if spatial judgment is needed
• Difficult for target acquisition
– Items are displayed away from their actual screen location

  • Copyrights © 2019-2020 Rex