<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Immersive Intelligence Colleagues &#187; Blog</title>
	<atom:link href="http://im-tel.org/category/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://im-tel.org</link>
	<description>...exploring collaborative virtual spaces to solve hard problems</description>
	<lastBuildDate>Sun, 08 Apr 2012 17:10:13 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Human Judgment versus Machine Learning</title>
		<link>http://im-tel.org/2012/04/07/human-judgment-versus-machine-learning/</link>
		<comments>http://im-tel.org/2012/04/07/human-judgment-versus-machine-learning/#comments</comments>
		<pubDate>Sun, 08 Apr 2012 04:07:45 +0000</pubDate>
		<dc:creator>richardh</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[BigData]]></category>
		<category><![CDATA[DataMining]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[MachineLearning]]></category>
		<category><![CDATA[VisualAnalytics]]></category>

		<guid isPermaLink="false">http://im-tel.fragileearthstudios.com/?p=656</guid>
		<description><![CDATA[This last week a nine-week online course entitled &#8220;Learning From Data&#8221; started, taught by Caltech Professor Yaser Abu-Mostafa. As they promoted&#8230; &#8220;A real Caltech course, not a watered-down version, broadcast live from the lecture hall at Caltech.&#8221; The course objective is &#8220;machine learning that covers the basic theory, algorithms, and applications, that enables computational systems [...]]]></description>
			<content:encoded><![CDATA[<p>This last week a <a href="http://work.caltech.edu/telecourse.html" target="_blank">nine-week online course entitled &#8220;Learning From Data&#8221;</a> started, taught by Caltech Professor Yaser Abu-Mostafa. As they promoted&#8230; &#8220;A real Caltech course, <span style="text-decoration: underline">not</span> a watered-down version, broadcast live from the lecture hall at Caltech.&#8221; The course objective is &#8220;machine learning that covers the basic theory, algorithms, and applications, that enables computational systems to adaptively improve their performance with experience accumulated from the observed data.&#8221; A <a href="http://www.amazon.com/Learning-From-Data-Yaser-Abu-Mostafa/dp/1600490069/" target="_blank">book by the same title</a> covering the same material is available.</p>
<p>I am attending (when schedule permits) because I believe that Machine Learning (ML) will become (or already has become) a basic analysis technique for any complex system. However, I was surprised by a recent <a href="http://www.kdnuggets.com/2012/04/sceptical-of-machine-learning.html" target="_blank">poll in KDnuggets</a> that asked: &#8220;Can Machine Learning on Big Data replace Domain Expertise?&#8221; The majority (55%) felt that &#8220;there are many domains where machine learning cannot beat domain expertise&#8221;. However, Gregory Piatetsky-Shapiro (newsletter editor) argued that there are a growing number of examples where ML on Big Data outperforms domain expertise. Many of the Knowledge Discovery (KD) competitions over the past ten years have confirmed this.</p>
<p>I believe that successful applications of ML will involve a synthesis of ML with human domain expertise. The ML component will provide hints and basic instrumentation. However, humans will provide judgment and insights based on their domain expertise. Could a naive domain expert use ML functionality to perform useful analyses? Of course. However, a savvy domain expert could leverage ML much more.</p>
]]></content:encoded>
			<wfw:commentRss>http://im-tel.org/2012/04/07/human-judgment-versus-machine-learning/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Beginning of Interactive Data Visualization</title>
		<link>http://im-tel.org/2012/04/06/beginning-of-interactive-data-visualization/</link>
		<comments>http://im-tel.org/2012/04/06/beginning-of-interactive-data-visualization/#comments</comments>
		<pubDate>Sat, 07 Apr 2012 03:23:18 +0000</pubDate>
		<dc:creator>richardh</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[DataViz]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[VisualAnalytics]]></category>

		<guid isPermaLink="false">http://im-tel.fragileearthstudios.com/?p=658</guid>
		<description><![CDATA[I was poking around in Nathan Yau&#8217;s FlowingData blogs and found a historical gem. On January 1, 2008, Nathan wrote a blog on John Tukey, the pioneer in exploratory statistics. I did not realize that Tukey was also a pioneer in the early use of computers for data visualization! In 1972 using &#8220;32 buttons and [...]]]></description>
			<content:encoded><![CDATA[<p>I was poking around in Nathan Yau&#8217;s <a href="http://flowingdata.com/" target="_blank">FlowingData</a> blogs and found a historical gem. On January 1, 2008, Nathan wrote a blog on John Tukey, the pioneer in exploratory statistics. I did not realize that Tukey was also a pioneer in the early use of computers for data visualization!</p>
<p>In 1972 using &#8220;32 buttons and a lightpen&#8221; on &#8220;an Information Display&#8217;s IDIIOM refresh CRT driven by a <a href="http://en.wikipedia.org/wiki/Varian_Data_Machines" target="_blank">Varian 620/i minicomputer</a> linked to an IBM 360/91&#8221;, Tukey developed the PRIM-9 program to do multivariate analysis. It handled up to 9-dimensional data with the functions of &#8220;picturing, rotation, isolation and masking&#8221;. A <a href="http://books.google.com/books?hl=en&amp;lr=&amp;id=pZTIv3uq1KsC&amp;oi=fnd&amp;pg=PA91&amp;dq=Prim-9+J.W.+Tukey,+J.H.+Friedman+and+M.A.+Fisherkeller&amp;ots=4bpFO8HsIJ&amp;sig=V7Z2242N0yUoX_oO8htmm69TPxg#v=onepage&amp;q=Prim-9%20J.W.%20Tukey%2C%20J.H.%20Friedman%20and%20M.A.%20Fisherkeller&amp;f=false" target="_blank">paper in May of 1974</a> describes the operation of his program. Particularly insightful is the Discussion section at the end, in which Tukey gives his best practices for discovering meaningful relationships hidden within the 9 dimensions of the data. Nathan suggests that the <a href="http://ggobi.org/" target="_blank">GGobi</a> visualization software by Hadley Wickham et al. owes a bit of its heritage to Tukey&#8217;s PRIM-9.</p>
<p>The real treat is a <a href="http://stat-graphics.org/movies/prim9.html" target="_blank">25-minute video</a> from 1973. Take the time to view this! Despite the awkwardness of ancient computer equipment, Tukey teases out the patterns in what initially appear as random dots.</p>
<p>Toward the end of this video, I was struck by the implications of Tukey&#8217;s data visualization work for Immersive Intelligence. Here is one of the pioneers of modern data analysis showing us the value of 3-D visualizations&#8230;as an early approach to immersing oneself in the data. And this was forty years ago!</p>
]]></content:encoded>
			<wfw:commentRss>http://im-tel.org/2012/04/06/beginning-of-interactive-data-visualization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VAST Challenge: What is a Healthy Network?</title>
		<link>http://im-tel.org/2012/03/23/vast-challenge-what-is-a-healthy-network/</link>
		<comments>http://im-tel.org/2012/03/23/vast-challenge-what-is-a-healthy-network/#comments</comments>
		<pubDate>Fri, 23 Mar 2012 19:55:48 +0000</pubDate>
		<dc:creator>richardh</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[DataMining]]></category>
		<category><![CDATA[DataViz]]></category>
		<category><![CDATA[VASTchallenge]]></category>
		<category><![CDATA[VisualAnalytics]]></category>

		<guid isPermaLink="false">http://im-tel.fragileearthstudios.com/?p=635</guid>
		<description><![CDATA[In the overview blog of the VAST Challenge, we described the background and focus of the challenge, along with available data. In this blog, let&#8217;s probe the criteria for a healthy network as defined in Mini-Challenge 1A: It seems that the criteria for network health are loosely defined. Any deviation from the normal pattern could [...]]]></description>
			<content:encoded><![CDATA[<p>In the <a title="VAST Challenge: Initial Look" href="http://im-tel.org/2012/03/18/vast-challenge-initial-look/" target="_blank">overview blog</a> of the <a href="http://www.vacommunity.org/VAST+Challenge+2012" target="_blank">VAST Challenge</a>, we described the background and focus of the challenge, along with available data. In this blog, let&#8217;s probe the criteria for a healthy network as defined in Mini-Challenge 1A:</p>
<span id="box_typeshadow_Create_a_visualization_of_the_health_and_policy_status_of_the_entire_bank_enterprise_as_of_2_pm_on_February_2._What_areas_of_concern_do_you_observebox_"><h4><em><div class='et-box et-shadow'>
					<div class='et-box-content'>Create a visualization of the health and policy status of the entire bank enterprise as of 2 pm on February 2. What areas of concern do you observe?</div></div> </em></h4></span>
<p>It seems that the criteria for network health are loosely defined. Any deviation from the normal pattern could be an area of concern.</p>
<span id="Normal_Operation"><h3>Normal Operation</h3></span>
<p>Normal operation is fairly easy to identify according to the data definitions. A policy status of &#8220;1&#8221; is &#8220;machine is functioning normally and is healthy&#8221;. Likewise, an activity flag of &#8220;1&#8221; is &#8220;normal with only normal activity detected on the equipment&#8221;.</p>
<p>A new table &#8216;health&#8217; was created by joining &#8216;meta&#8217; with &#8216;windowOneSingle&#8217; on &#8216;ipaddr&#8217;. This generated health records for 809,216 devices in 51 business units and 206 facilities. For policy = &#8216;1&#8217; and activity = &#8216;1&#8217;, there were 646,127 (80%) devices that were in normal operation.</p>
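<p>The join and the normal-operation count can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module rather than MySQL, a handful of made-up rows, and assumed column names (businessunit, facility, policystatus, activityflag); only the table names and the ipaddr key come from the description above.</p>

```python
import sqlite3

# Toy stand-ins for the 'meta' and 'windowOneSingle' tables.
# All rows are invented; every column name except ipaddr is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE meta (ipaddr TEXT PRIMARY KEY, businessunit TEXT, facility TEXT);
    CREATE TABLE windowOneSingle (ipaddr TEXT, policystatus INTEGER, activityflag INTEGER);
    INSERT INTO meta VALUES
        ('10.0.0.1', 'unit1', 'branch1'),
        ('10.0.0.2', 'unit1', 'branch1'),
        ('10.0.0.3', 'unit2', 'branch7'),
        ('10.0.0.4', 'unit2', 'branch7'),
        ('10.0.0.5', 'unit2', 'branch9');
    INSERT INTO windowOneSingle VALUES
        ('10.0.0.1', 1, 1),
        ('10.0.0.2', 1, 1),
        ('10.0.0.3', 2, 1),
        ('10.0.0.4', 1, 1);
""")

# Build the 'health' table by joining the two tables on ipaddr.
# An inner join drops devices that have no status record (10.0.0.5 here).
conn.execute("""
    CREATE TABLE health AS
    SELECT m.ipaddr, m.businessunit, m.facility, w.policystatus, w.activityflag
    FROM meta AS m
    JOIN windowOneSingle AS w ON w.ipaddr = m.ipaddr
""")

total = conn.execute("SELECT COUNT(*) FROM health").fetchone()[0]
normal = conn.execute(
    "SELECT COUNT(*) FROM health WHERE policystatus = 1 AND activityflag = 1"
).fetchone()[0]

print(f"{normal} of {total} devices in normal operation ({100 * normal / total:.0f}%)")
```

<p>The same two COUNT queries, run against the real table, would yield the kind of 646,127 out of 809,216 figure quoted above.</p>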
<span id="Normal_Business_Hours"><h3>Normal Business Hours</h3></span>
<p>We probably should add normal business hours (7am to 6pm Monday-Friday) to the criteria for &#8216;normal operation&#8217; &#8230;at least for workstations (though not for servers and ATMs), since workstations should be turned off outside of business hours.</p>
<p>The &#8216;health&#8217; table had records for eight timezones &#8211; BMT-4 to BMT-11. This data was a snapshot at BMT = 14:00 (2:00pm) on February 2, which is a Thursday. Hence, only timezones BMT-4 to BMT-7 are within normal business hours, which implies that BMT-8 to BMT-11 are before normal business hours. There were records for 84,696 (10%) workstations that were in operation before normal business hours. NOTE: the criteria for whether a workstation is turned off are uncertain.</p>
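<p>The timezone reasoning above can be checked with a few lines of Python. This is a minimal sketch: the function and constant names are made up for illustration, and business hours are taken as 7am up to (but not including) 6pm local time.</p>

```python
BMT_SNAPSHOT_HOUR = 14          # snapshot taken at BMT 14:00
BUSINESS_HOURS = range(7, 18)   # 07:00 up to, but not including, 18:00

def local_hour(offset):
    """Local hour of day for a timezone that is `offset` hours from BMT."""
    return (BMT_SNAPSHOT_HOUR + offset) % 24

def in_business_hours(offset):
    """True if the snapshot falls within normal business hours locally."""
    return local_hour(offset) in BUSINESS_HOURS

# The eight timezones present in the 'health' table: BMT-4 down to BMT-11.
for offset in range(-4, -12, -1):
    status = "business hours" if in_business_hours(offset) else "before business hours"
    print(f"BMT{offset}: {local_hour(offset):02d}:00 local, {status}")
```

<p>BMT-7 lands exactly on 07:00 local time, so whether it counts as &#8216;normal&#8217; depends on treating the start of business hours as inclusive, as this sketch does.</p>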
<span id="Normal_Maintenance"><h3>Normal Maintenance</h3></span>
<p>Another criterion for &#8216;normal operation&#8217; could be whether a device is under maintenance and whether it was &#8216;planned on a regular schedule&#8217;.</p>
<p>Out of 809,216 devices, there were only 743 devices under maintenance (activity = &#8216;2&#8217;) across all timezones. This seems odd! Not much maintenance is being performed on this network. Further, there seems to be no indication as to whether this maintenance was planned or not.</p>
<p>More analysis on this &#8216;health&#8217; table will be performed using several data viz tools, like QlikView and Tableau, and reported in future blogs.</p>
]]></content:encoded>
			<wfw:commentRss>http://im-tel.org/2012/03/23/vast-challenge-what-is-a-healthy-network/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VAST Challenge: Surveying the Geography</title>
		<link>http://im-tel.org/2012/03/20/vast-challenge-surveying-the-geography/</link>
		<comments>http://im-tel.org/2012/03/20/vast-challenge-surveying-the-geography/#comments</comments>
		<pubDate>Tue, 20 Mar 2012 15:23:29 +0000</pubDate>
		<dc:creator>richardh</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[DataViz]]></category>
		<category><![CDATA[VASTchallenge]]></category>

		<guid isPermaLink="false">http://im-tel.fragileearthstudios.com/?p=625</guid>
		<description><![CDATA[In the overview blog of the VAST Challenge, we described the background and focus of the challenge, along with available data. In this blog, let&#8217;s survey the geography of this weird planet called BankWorld. It is the same size as Earth, but consists of a single large land mass, about the size of Europe and [...]]]></description>
			<content:encoded><![CDATA[<p>In the <a title="VAST Challenge: Initial Look" href="http://im-tel.org/2012/03/18/vast-challenge-initial-look/" target="_blank">overview blog</a> of the <a href="http://www.vacommunity.org/VAST+Challenge+2012" target="_blank">VAST Challenge</a>, we described the background and focus of the challenge, along with available data. In this blog, let&#8217;s survey the geography of this weird planet called BankWorld.</p>
<p>It is the same size as Earth, but consists of a single large land mass, about the size of Europe and Asia, situated over North America, the northern part of South America, and the Pacific Ocean out to the Hawaiian Islands.</p>
<p>The best way to visualize this geography is Google Earth, especially since the challenge designers gave a set of KML layers. So, bring up your copy of Google Earth, position it over Mexico City so that the whole globe is visible, and turn off all the layers in the Primary Database (lower left box).</p>
<p><a href="http://im-tel.org/files/2012/03/VAST-Global1.png"><img class="alignright size-medium wp-image-627" src="http://im-tel.org/files/2012/03/VAST-Global1-300x248.png" alt="" width="300" height="248" /></a>Ready? Find the folder &#8220;Mini-Challenge 1 Image and Google Earth Files&#8221; that you downloaded. I found that it was best to drag-drop each KML file individually onto the Earth &#8230;transforming it into BankWorld! Do it in this order: BankWorld, BankCenters, Large Regional Offices, Region Boundaries, Small Region Offices, Large Branch Offices, Small Branch Offices, or something like that&#8230;</p>
<p>Play with the check boxes in the upper left. You should see something like the image on the right. Click it for a hi-res version. Get familiar with the geography and the icons. The yellow mailbox is global headquarters for the bank. The yellow &#8216;hairs&#8217; (actually pushpins if you zoom in) are the branch offices, each of which has lots of workstations, servers, ATMs, etc.</p>
<p>The important aspect to note is the timezones (not shown), which start in London (which does not exist in BankWorld) and proceed west in 15-degree intervals of longitude. Hence, a place with longitude of -70.5 would be -4 hours from the BankWorld Mean Time (BMT). In other words, Timezone = integer (Longitude / 15) &#8211; 1. Now you can determine the local time at a branch office, and thus monitor whether transactions are being performed during normal business hours.</p>
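<p>Here is a small Python sketch of the longitude-to-timezone conversion. It assumes that &#8220;integer&#8221; means truncation toward zero, and the function names are made up for illustration. Under that assumption, the formula as written gives -5 for a longitude of -70.5, whereas truncation without the trailing &#8220;- 1&#8221; gives the -4 quoted in the example, so the sketch exposes both variants; check them against the challenge documentation before relying on either.</p>

```python
def timezone_as_written(longitude):
    """Timezone = integer(Longitude / 15) - 1, with integer() read as truncation."""
    return int(longitude / 15) - 1

def timezone_truncated(longitude):
    """Variant without the trailing '- 1'; this one reproduces the -4 example."""
    return int(longitude / 15)

def local_hour(bmt_hour, offset):
    """Local hour of day given the BMT hour and a timezone offset in hours."""
    return (bmt_hour + offset) % 24

lon = -70.5
print("as written:", timezone_as_written(lon))
print("truncated: ", timezone_truncated(lon))
print("local hour at BMT 14:00:", local_hour(14, timezone_truncated(lon)))
```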
<p>This is certainly a creative transformation of Earth&#8217;s geography for this challenge and a fun use of Google Earth (and other KML tools).</p>
]]></content:encoded>
			<wfw:commentRss>http://im-tel.org/2012/03/20/vast-challenge-surveying-the-geography/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VAST Challenge: Initial Look</title>
		<link>http://im-tel.org/2012/03/18/vast-challenge-initial-look/</link>
		<comments>http://im-tel.org/2012/03/18/vast-challenge-initial-look/#comments</comments>
		<pubDate>Sun, 18 Mar 2012 19:49:17 +0000</pubDate>
		<dc:creator>richardh</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[DataMining]]></category>
		<category><![CDATA[DataViz]]></category>
		<category><![CDATA[Education]]></category>

		<guid isPermaLink="false">http://im-tel.fragileearthstudios.com/?p=610</guid>
		<description><![CDATA[The Visual Analytics Community released their VAST Challenge 2012. [By the way, VAST stands for "Visual Analytics Science and Technology".] This challenge has a ten-year lineage initiated by the Human-Computer Interaction Lab at the University of Maryland and archived at the Visual Analytics Benchmark Repository. The challenge will conclude on July 9 and become [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.vacommunity.org/tiki-index.php" target="_blank">Visual Analytics Community</a> released their <a href="http://www.vacommunity.org/VAST+Challenge+2012" target="_blank">VAST Challenge 2012</a>. [By the way, VAST stands for "Visual Analytics Science and Technology".] This challenge has a ten-year lineage initiated by the Human-Computer Interaction Lab at the University of Maryland and archived at the <a href="http://hcil.cs.umd.edu/localphp/hcil/vast/archive/viewbm.php" target="_blank">Visual Analytics Benchmark Repository</a>. The challenge will conclude on July 9 and become a session at <a href="http://visweek.org/" target="_blank">IEEE VisWeek</a>, which this year is in Seattle on October 14-19.</p>
<span id="What_is_the_challenge"><h3>What is the challenge?</h3></span>
<p>The challenge deals with &#8220;Big Data&#8221; although the total amount of data is less than 10 GB. The situation is cyber-security for a large bank with hundreds of branch offices spread across a fictitious world, complete with lat/long geographic coordinates and KML annotations.</p>
<p>There are two mini-challenges, only the first of which has been released. Mini-challenge #1 is to provide &#8220;situation awareness of the cyber-health of the bank&#8217;s network&#8221;. In their words, &#8220;how do you visualize data out of a network containing nearly <strong>a million computers</strong> in a way that you can perceive its state and identify problems?&#8221; Actually, the bank network consists of 895,025 IP addresses, as shown in the table at the right. <a href="http://im-tel.org/files/2012/03/BOM-Organization.png"><img class="alignright size-medium wp-image-614" src="http://im-tel.org/files/2012/03/BOM-Organization-300x105.png" alt="" width="300" height="105" /></a></p>
<p>Mini-challenge #1 requires two responses:</p>
<ul>
<li>A &#8211; Create a visualization of the health and policy status of the entire bank enterprise as of 2 pm on February 2. What areas of concern do you observe?</li>
<li>B &#8211; Use your visualization tools to look at how the network’s status changes over time. Highlight up to five potential anomalies in the network and provide a visualization of each. When did each anomaly begin and end? What might be an explanation of each anomaly?</li>
</ul>
<p>I downloaded the data sets, consisting of three tables:</p>
<ul>
<li>Meta data about the organization and location of network nodes (workstations, routers, servers) &#8211; 1.1 M rows as 63 MB CSV file</li>
<li>Health status data about each node through time &#8211; 80 M rows as 7.8 GB CSV file</li>
<li>Health status data for a time window &#8211; 0.9 M rows as 40 MB CSV file</li>
</ul>
<p>Here is an <a href="http://im-tel.org/files/2012/03/metaDB-20K.zip" target="_blank">Excel file</a> (2 MB) containing the first 20K rows for each of these three tables, along with <a href="http://im-tel.org/files/2012/03/BankWorld-Documentation.zip" target="_blank">two documentation files</a> containing an overview of the banking world and a table explanation.</p>
<p>Using <a href="http://www.wampserver.com/en/" target="_blank">WAMP</a>, the data was loaded into MySQL for profiling. The status data took about an hour to load, without any performance tuning. Still loading&#8230;</p>
<span id="Update_3192012_10:05pm"><h3>Update 3/19/2012 10:05pm</h3></span>
<p>Yesterday the load of the large health status table aborted with the error &#8220;multi-statement transaction required more than &#8216;max_binlog_cache_size&#8217; bytes of storage; increase this mysql variable and try again&#8221;. After consulting with my MySQL technical wizard Roland Bouman, I disabled the binary logging for replication in the &#8216;my.ini&#8217; configuration file and reran the load. It took almost two hours, but completed after loading 133 M rows. The stats on the tables are shown at the right&#8230;<a href="http://im-tel.org/files/2012/03/VAST-table-stats.png"><img class="alignright size-full wp-image-623" src="http://im-tel.org/files/2012/03/VAST-table-stats.png" alt="" width="429" height="178" /></a></p>
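<p>The error arises when a single transaction grows too large for the binary log cache. Besides disabling binary logging, a different way to keep transactions small is to load the file in batches and commit after each one. The sketch below illustrates that idea only; it is not what was done here. It uses Python's built-in sqlite3 module instead of MySQL, a tiny made-up CSV, and hypothetical table and column names.</p>

```python
import csv
import io
import sqlite3
from itertools import islice

def load_in_batches(conn, rows, batch_size):
    """Insert rows in batches, committing after each batch.

    Returns the number of batches committed. Each commit ends the current
    transaction, so no single transaction holds more than batch_size rows.
    """
    rows = iter(rows)
    batches = 0
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            break
        conn.executemany("INSERT INTO status VALUES (?, ?, ?)", batch)
        conn.commit()
        batches += 1
    return batches

# Made-up stand-in for the large health status CSV file.
toy_csv = io.StringIO(
    "10.0.0.1,1,1\n"
    "10.0.0.2,1,2\n"
    "10.0.0.3,2,1\n"
    "10.0.0.4,1,1\n"
    "10.0.0.5,3,1\n"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE status (ipaddr TEXT, policystatus INTEGER, activityflag INTEGER)"
)

batches = load_in_batches(conn, csv.reader(toy_csv), batch_size=2)
loaded = conn.execute("SELECT COUNT(*) FROM status").fetchone()[0]
print(f"loaded {loaded} rows in {batches} batches")
```

<p>For a multi-gigabyte file the batch size would be far larger, and the right value depends on the database configuration.</p>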
<span id="Why_is_VAST_Challenge_Relevant_to_Immersive_Intelligence"><h3>Why is VAST Challenge Relevant to Immersive Intelligence?</h3></span>
<p>David Burden of <a href="http://daden.co.uk/" target="_blank">Daden Limited</a> suggested that we form a team to conduct a workshop at the <a href="http://www.ndu.edu/icollege/fcvw/" target="_blank">Federal Consortium for Virtual Worlds</a> in May. As part of that workshop, we have been searching for a problem upon which to focus. So, we are discussing whether the VAST Challenge would be an appropriate context for the workshop. Please join us by commenting below, or by attending FCVW.</p>
]]></content:encoded>
			<wfw:commentRss>http://im-tel.org/2012/03/18/vast-challenge-initial-look/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Stanford Graduate Certificate in Mining Massive Data Sets</title>
		<link>http://im-tel.org/2012/02/29/stanford-graduate-certificate-in-mining-massive-data-sets/</link>
		<comments>http://im-tel.org/2012/02/29/stanford-graduate-certificate-in-mining-massive-data-sets/#comments</comments>
		<pubDate>Wed, 29 Feb 2012 17:26:39 +0000</pubDate>
		<dc:creator>richardh</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[DataMining]]></category>

		<guid isPermaLink="false">http://im-tel.fragileearthstudios.com/?p=594</guid>
		<description><![CDATA[This is not new, but this offering amazes me each time I read its description! The Stanford Center for Professional Development at Stanford University offers a &#8216;graduate certificate&#8217; in cutting-edge material about Big Data and Data Mining. This is a seriously tough sequence of four courses. The cost ranges from $14,000 to $17,000 and [...]]]></description>
			<content:encoded><![CDATA[<p>This is not new, but this offering amazes me each time I read its description! The <strong>Stanford Center for Professional Development</strong> at Stanford University offers a &#8216;<a href="http://scpd.stanford.edu/public/category/courseCategoryCertificateProfile.do?method=load&amp;certificateId=10555807" target="_blank">graduate certificate</a>&#8217; in cutting-edge material about Big Data and Data Mining. This is a seriously tough sequence of four courses. The cost ranges from $14,000 to $17,000 and will take two years to complete. The four courses, listed below, are taught online (with some presence on the Stanford campus).</p>
<ul>
<li><strong>Social and Information Network Analysis</strong> &#8211; how to analyze the structure and dynamics of large networks, how to model links, and how to design algorithms that work with such large networks</li>
<li><strong>Machine Learning</strong> &#8211; Design and development of algorithms and techniques that allow computers to &#8220;learn&#8221; by extracting information from data automatically</li>
<li><strong>Mining Massive Data Sets</strong> &#8211; data mining of distributed file systems: Hadoop, map-reduce; PageRank, topic-sensitive PageRank, spam detection, hubs-and-authorities; similarity search; etc.</li>
<li><strong>Information Retrieval and Web Search</strong> &#8211; efficient text indexing; Boolean and vector space retrieval models; evaluation and interface issues; Web search including crawling, etc.</li>
</ul>
<p>Combining this material with Immersive Intelligence would be awesome! Please contact me if you are enrolled in this program.</p>
]]></content:encoded>
			<wfw:commentRss>http://im-tel.org/2012/02/29/stanford-graduate-certificate-in-mining-massive-data-sets/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Going Too Far with Predictive Analytics?</title>
		<link>http://im-tel.org/2012/02/28/going-too-far-with-predictive-analytics/</link>
		<comments>http://im-tel.org/2012/02/28/going-too-far-with-predictive-analytics/#comments</comments>
		<pubDate>Tue, 28 Feb 2012 23:56:12 +0000</pubDate>
		<dc:creator>richardh</dc:creator>
				<category><![CDATA[Blog]]></category>

		<guid isPermaLink="false">http://im-tel.fragileearthstudios.com/?p=586</guid>
		<description><![CDATA[The current issue of KDnuggets has a poll on &#8220;Was Target wrong in using analytics to find pregnant women?&#8221;. The New York Times detailed Target&#8217;s successful data mining of customer buying patterns to identify pregnant women. There has been a strongly negative reaction to this story, although there is considerable debate about where the Right/Wrong [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.kdnuggets.com/2012/02/new-poll-target-analytics-wrong-to-find-pregnant-women.html" target="_blank">current issue of KDnuggets</a> has a poll on &#8220;Was Target wrong in using analytics to find pregnant women?&#8221;. The New York Times detailed Target&#8217;s successful data mining of customer buying patterns to identify pregnant women. There has been a strongly negative reaction to this story, although there is considerable debate about where the Right/Wrong line should be in Target&#8217;s situation. Even Colbert <a href="http://www.colbertnation.com/the-colbert-report-videos/408981/february-22-2012/the-word---surrender-to-a-buyer-power" target="_blank">weighed in</a> on the controversy. In the KDnuggets poll so far, 75% of about 250 professional data miners felt that Target did nothing wrong. Watch as this poll unfolds, especially with the variety of comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://im-tel.org/2012/02/28/going-too-far-with-predictive-analytics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Innovation from Cross-Disciplinary Research</title>
		<link>http://im-tel.org/2012/01/16/innovation-from-cross-disciplinary-research/</link>
		<comments>http://im-tel.org/2012/01/16/innovation-from-cross-disciplinary-research/#comments</comments>
		<pubDate>Tue, 17 Jan 2012 01:23:08 +0000</pubDate>
		<dc:creator>richardh</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[DataViz]]></category>

		<guid isPermaLink="false">http://im-tel.fragileearthstudios.com/?p=580</guid>
		<description><![CDATA[From personal experience, I knew that innovative ideas within my discipline often come from research in quite dissimilar disciplines. Michelle Borkin of Harvard University hit that nail squarely, driving it through the 2&#215;4 with this TED talk. She relates medical imaging from MRI scans to astronomy data of distant nebulae. And, then she proceeds from [...]]]></description>
			<content:encoded><![CDATA[<p>From personal experience, I knew that innovative ideas within my discipline often come from research in quite dissimilar disciplines. Michelle Borkin of Harvard University hit that nail squarely, driving it through the 2&#215;4 with <a href="http://www.ted.com/talks/michelle_borkin_can_astronomers_help_doctors.html" target="_blank">this TED talk</a>. She relates medical imaging from MRI scans to astronomy data of distant nebulae. And then she proceeds from there. Her parting comment is &#8220;You really never know where your next great idea is going to come from.&#8221;</p>
<p>Note the many ways that 3D data is gradually emerging from research in many disciplines. I feel that our current visualization tools are not providing a smooth transition to 3D data analysis from the traditional 2D visualization approaches. Perhaps cross-disciplinary exchanges will provide the necessary catalyst.</p>
]]></content:encoded>
			<wfw:commentRss>http://im-tel.org/2012/01/16/innovation-from-cross-disciplinary-research/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thoughts about Initial Scenes in IM-TEL worlds</title>
		<link>http://im-tel.org/2011/12/05/thoughts-about-initial-scenes-in-im-tel-worlds/</link>
		<comments>http://im-tel.org/2011/12/05/thoughts-about-initial-scenes-in-im-tel-worlds/#comments</comments>
		<pubDate>Mon, 05 Dec 2011 22:40:07 +0000</pubDate>
		<dc:creator>richardh</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[BusinessIntelligence]]></category>
		<category><![CDATA[DataViz]]></category>

		<guid isPermaLink="false">http://im-tel.fragileearthstudios.com/?p=551</guid>
		<description><![CDATA[This morning I was thinking about how to illustrate a virtual world used for Immersive Intelligence. It was a familiar topic that usually gets lost in a blizzard of details. However, it occurred to me that the first design choice is to define the nature of the virtual space. This definition then determines the dimensions [...]]]></description>
			<content:encoded><![CDATA[<p>This morning I was thinking about how to illustrate a virtual world used for Immersive Intelligence. It was a familiar topic that usually gets lost in a blizzard of details. However, it occurred to me that the first design choice is to define the nature of the virtual space. This definition then determines the dimensions used in the initial scene. In other words, the space is like the canvas upon which the data (info-objects) will be painted (rendered). Hence, we will refer to this initial scene as the “canvas”.<span id="more-551"></span></p>
<p>Here is a list of possible choices:</p>
<ul>
<li><strong>Plain terrain</strong>: A vast featureless plain with flat Euclidean geometry and downward gravity. Avatars can walk freely on the ground in the X and Z dimensions. There is a definite Y dimension (for the vertical “UP”). The ground level (Y=0) displays the base (or atomic) data, while the upward vertical contains various analytics derived from the base data, such as descriptive statistics, groupings, and clustering. [What would negative Y imply? Perhaps some kind of “drill-down”.]</li>
<li><strong>Free space</strong>: A vast featureless space in all three dimensions, with flat geometry but no gravity. Avatars float with rotation, so that there is no definite vertical. Useful for complex 3D structures (like molecules, star clusters) where the camera perspective is critical for viewing specific behavior.</li>
<li><strong>Flowing Time</strong>: The analogy is like a river of time. Avatars flow along with time, which is attached to one of the three dimensions in either Plain Terrain or Free Space.</li>
<li><strong>Stretchy Dimension</strong>: Similar to Plain Terrain, except one or more of the XYZ dimensions are not linear scales. For instance, the X dimension could be logarithmic, so that you could compare the very small with the very large. [How could ordinal and interval scales (as opposed to the usual ratio scale) be rendered differently?]</li>
<li><strong>Closed Elliptical</strong>: This canvas is like creating info-objects on a sphere, although it could be any enclosed volume whose slices form ellipses. The sum of angles of any triangle drawn on its surface is greater than 180°. An example of this canvas is the primary perspective used in the <a href="http://fragileearthstudios.com/terraviz/" target="_blank">TerraViz project</a> by FragileEarthStudios.</li>
<li><strong>Open Hyperbolic</strong>: In contrast to the Closed Elliptical canvas, the Open Hyperbolic canvas is like a saddle between two mountain peaks, where the sum of angles of any triangle drawn on its surface is less than 180°. Avatars walking in any direction would continue forever, never returning to a previous place.</li>
<li><strong>Spacetime</strong>: This canvas is an imitation of Einstein’s special relativity, which combines space with time. One approach is to render info-objects that automatically flow with time (as in Flowing Time), like an animated weather map but in 3D. Another approach is to momentarily assign one XYZ dimension to time so that info-objects are positioned according to time, but with the ability to quickly interchange time with another XYZ dimension. Yet another approach would be to concurrently display the four possible configurations (XYZ, XYT, XTZ, TYZ) in close proximity for easy comparison.</li>
<li><strong>Higher Dimension</strong>: This canvas is a challenge to design! The situation is a data set whose entities have many (even thousands of!) attributes. Through dimension-reducing analytics (like cluster analysis), many attributes would be collapsed into a single scene dimension. The usefulness may lie not in the final result but in the intermediate processing, which shows how similar entities cluster together.</li>
</ul>
<p>To help orient users of the virtual world, an initial scene should have a Viewing Platform where avatars initially &#8220;rez&#8221; and to which they can easily return as their home. For example, the NASA <a href="http://en.wikipedia.org/wiki/File:Sl_victoria_crater.jpg" target="_blank">Victoria Crater rendition</a> in SecondLife had a platform structure with information boards. The Viewing Platform could also contain Viewing Vehicles in which one or more avatars could tour various highlights of the world.</p>
<p>This blog post is obviously a discussion piece. So, please comment and share your ideas.</p>
]]></content:encoded>
			<wfw:commentRss>http://im-tel.org/2011/12/05/thoughts-about-initial-scenes-in-im-tel-worlds/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interactive Dynamic Systems</title>
		<link>http://im-tel.org/2011/11/29/interactive-dynamic-systems/</link>
		<comments>http://im-tel.org/2011/11/29/interactive-dynamic-systems/#comments</comments>
		<pubDate>Tue, 29 Nov 2011 23:54:03 +0000</pubDate>
		<dc:creator>richardh</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[DataViz]]></category>

		<guid isPermaLink="false">http://im-tel.fragileearthstudios.com/?p=544</guid>
		<description><![CDATA[We have concentrated on data visualization too much! That is, visualizing data that has been collected, processed and stored in some database. But what about data that is generated from equations? How can we visualize this type of data and especially the behavior emerging from the equations? Check out this video by Bret Victor. Simple [...]]]></description>
			<content:encoded><![CDATA[<p>We have concentrated on data visualization too much! That is, visualizing data that has been collected, processed and stored in some database. But what about data that is generated from equations? How can we visualize this type of data and especially the behavior emerging from the equations?<span id="more-544"></span></p>
<p>Check out this <a href="http://vimeo.com/23839605/" target="_blank">video</a> by <a href="http://worrydream.com/" target="_blank">Bret Victor</a>. Simple and thoughtful! Note how the multi-touch user interface (of the iPad) enhances the interaction with a complex set of differential equations. I like the intuitive way that he manipulates the equations to learn their behavior.</p>
<p>Also check out the other work that Bret has done!</p>
]]></content:encoded>
			<wfw:commentRss>http://im-tel.org/2011/11/29/interactive-dynamic-systems/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

