1425: Tasks - Revision history

172.71.241.145: /* Explanation */

2025-04-24T09:26:36Z

172.68.22.108: /* Explanation */ remove extra words

2025-04-24T04:37:05Z

172.68.22.108: /* Explanation */ move GPS, remove excess words, more on positioning systems

2025-04-24T04:34:59Z

172.68.22.109: /* Explanation */ wikilink

2025-04-24T04:20:37Z

172.68.22.108: /* Explanation */ location - GPS or other positioning (GLONAS, wifi, ...)

2025-04-24T04:15:58Z

172.68.22.108: /* Explanation */ we also use sound and focal distance. More clearly differentiate the image understanding from the more general problem posed in the comic

2025-04-24T04:09:07Z

172.71.150.168: /* Explanation */ an app doesn't have to just use a static photograph

2025-04-24T03:30:11Z

DollarStoreBa'al: Undo revision 374636 by Uihr (talk) Ok tf is this

2025-04-23T20:35:38Z

Uihr at 19:55, 23 April 2025

2025-04-23T19:55:01Z

141.101.99.88: /* Explanation */ Without trying to deal with the basic facts, a smattering of changes to make the long run-on sentence more readable.

2025-01-31T08:42:24Z

‎Explanation: Without trying to deal with the basic facts, a smattering of changes to make the long run-on sentence more readable.

@@ Line 8: / Line 8: @@
 ==Explanation==
-[[Cueball]] appears to be asking [[Ponytail]] to write an app that determines if a given picture is (1) taken in a {{w|national park}}, and (2) a picture of a bird. The first question is generally harder for a human to answer, but easy for an app that has access to location information and a {{w|geographic information system}} (GIS). The second one is easy for a human but much harder for a computer. This illustrates {{w|Moravec's paradox}} from the 1980s in a modern context. By the 1950s computers were useful for tasks like {{w|trajectory optimization}}, {{w|automated theorem proving|generating novel mathematical proofs}}, and {{w|English_draughts#Computer_players|the game of checkers}}, so such high-level computation and reasoning tasks that were hard for humans turned out to be relatively easy for them. On the other hand, it turns out to be hard to "give them the skills of a one-year-old when it comes to perception", as Moravec wrote.
+[[Cueball]] appears to be asking [[Ponytail]] to write an app that determines if a given picture is (1) taken in a {{w|national park}}, and (2) a picture of a bird. The first question is generally harder for a human to answer, but easy for an app that has access to location information and a {{w|geographic information system}} (GIS). The second one is easy for a human but much harder for a computer. This illustrates {{w|Moravec's paradox}} from the 1980s in a modern context. By the 1950s computers were useful for tasks like {{w|trajectory optimization}}, {{w|automated theorem proving|generating novel mathematical proofs}} and {{w|English draughts#Computer players|the game of checkers}}, so such high-level computation and reasoning tasks that were hard for humans turned out to be relatively easy for them. On the other hand, it turns out to be hard to "give them the skills of a one-year-old when it comes to perception", as Moravec wrote.
-In order to determine whether the user is in a national park, Ponytail plans to determine the user's location using the location tracking receivers which are common in to many devices.  These provide location information using nearby radio sources, such as cell phone towers or WiFi hotspots, or the positions of satellites supplied by a {{w|Global Positioning System|GPS}} receiver. This location will then be checked with a {{w|geographic information system}} (GIS) which will determine whether the photographer is in a national park.
+To determine whether the user is in a national park, Ponytail plans to obtain the user's location from the location tracking receivers which were available in various camera devices at the time of the comic. These provide location information using nearby radio sources, such as cell phone towers or WiFi hotspots, or using satellite signals picked up by a {{w|Global Positioning System|GPS}} receiver. This location will then be checked with a {{w|geographic information system}} (GIS) which will determine whether the photographer is in a national park.
 Determining whether an image is of a given kind of natural object is far more difficult. This task falls into the area of {{w|computer vision}}. One of the goals in computer vision is to detect and classify objects within an image.
-Humans use size, focus, edge-assignment, movement (of both the subject and the observer), and stereoscopic vision when looking at a scene (not a picture of a thing, but the thing itself) to discern individual objects and then {{w|Figure-ground (perception)|categorize them as foreground or background}}. Sound may also assist in locating and identifying objects.  An app could use these techniques, as well as additional senses, such as distance to the subject and light outside our visual spectrum.
+Humans use size, focus, edge-assignment, movement (of both the subject and the observer) and stereoscopic vision when looking at a scene (not a picture of a thing, but the thing itself) to discern individual objects and then {{w|Figure-ground (perception)|categorize them as foreground or background}}. Sound may also assist in locating and identifying objects. An app could use these techniques, as well as additional senses, such as distance to the subject and light outside our visual spectrum.
-Identifying objects in a photograph is harder.  A photograph is a static, usually monoscopic image that can only provide size and edge-assignment clues. Humans are only able to discern objects from background in photographs by comparing the photo against all of the things they've seen and everything they've learned about those things over the course of their life and {{w|Visual perception|identifying matching patterns}}.
+Identifying objects in a photograph is harder. A photograph is a static, usually monoscopic image that can only provide size and edge-assignment clues. Humans are only able to discern objects from background in photographs by comparing the photo against all of the things they've seen and everything they've learned about those things over the course of their life and {{w|Visual perception|identifying matching patterns}}.
 The quality of the photograph will have an impact on a computer's ability to match patterns. For example, the object in the photograph might be partially visible or occluded. In the case of a living bird, additional complications arise from the variations among individual birds of the same species and differences in pose (flying, perching in a tree, etc.). Differentiating between visually similar objects can result in false positives. For example, is it a photo of a [[1792: Bird/Plane/Superman|bird in flight or a plane <s>(or superman!)</s>]]? Ponytail's estimate of 5 years may be overly optimistic (see [[678: Researcher Translation]]).
@@ Line 24: / Line 24: @@
 The subtitle refers to "CS", a common abbreviation for "{{w|Computer Science}}", of which {{w|artificial intelligence}} and {{w|computer vision}} are sub-disciplines.
-The title text mentions [http://dspace.mit.edu/bitstream/handle/1721.1/6125/AIM-100.pdf The Summer Vision Project] and {{w|Marvin Minsky}} of MIT. In the summer of 1966, he asked his undergraduate student {{w|Gerald Jay Sussman}} to [http://szeliski.org/Book/ "spend the summer linking a camera to a computer and getting the computer to describe what it saw"]. {{w|Seymour Papert}} drafted the plan, and it seems that Sussman was joined by {{w|Bill Gosper}}, {{w|Richard Greenblatt (programmer)|Richard Greenblatt}}, {{w|Leslie Lamport}}, Adolfo Guzman, Michael Speciner, John White, Benjamin, and Henneman - in case the multiple Wikipedia links don't give it away, know that this is a sizable cross-section of the AI researchers of the period. The project schedule allocated one summer for the completion of this task. The required time was obviously significantly underestimated, since dozens of research groups around the world are still working on this topic today.
+The title text mentions [http://dspace.mit.edu/bitstream/handle/1721.1/6125/AIM-100.pdf The Summer Vision Project] and {{w|Marvin Minsky}} of MIT. In the summer of 1966, he asked his undergraduate student {{w|Gerald Jay Sussman}} to [http://szeliski.org/Book/ "spend the summer linking a camera to a computer and getting the computer to describe what it saw"]. {{w|Seymour Papert}} drafted the plan, and it seems that Sussman was joined by {{w|Bill Gosper}}, {{w|Richard Greenblatt (programmer)|Richard Greenblatt}}, {{w|Leslie Lamport}}, Adolfo Guzman, Michael Speciner, John White, <!--firstname? -->Benjamin and <!--firstname? -->Henneman — in case the multiple Wikipedia links don't give it away, know that this is a sizable cross-section of the AI researchers of the period. The project schedule allocated one summer for the completion of this task. The required time was obviously significantly underestimated, since dozens of research groups around the world are still working on this topic today.
 A month after this comic came out, {{w|Flickr}} [http://code.flickr.net/2014/10/20/introducing-flickr-park-or-bird/ responded] with a [http://parkorbird.flickr.com/ prototype online tool] to do something similar to what the comic describes, using its automated-tagging software. According to them, the bird solution "took us less than 5 years to build, though it's definitely a hard problem, and we've still got room for improvement".
-Now, years later, the second problem of detecting birds (or any other objects) in the image has also turned into a relatively easy application of existing technologies. Image classification neural networks are readily available.  Many groups have put in the years of research (with teams of computer scientists) into the problem of computer vision, and thanks to recent breakthroughs in neural net architectures.
+Now, years later, the second problem of detecting birds (or any other objects) in the image has also turned into a relatively easy application of existing technologies. Image classification neural networks are readily available, thanks to the years of research that many groups (with teams of computer scientists) have put into the problem of computer vision and to recent breakthroughs in neural net architectures.
 ==Transcript==

@@ Line 14: / Line 14: @@
 Determining whether an image is of a given kind of natural object is far more difficult. This task falls into the area of {{w|computer vision}}. One of the goals in computer vision is to detect and classify objects within an image.
-Humans use size, focus, edge-assignment, movement (of both the subject and the observer), and stereoscopic vision when looking at a scene (not a picture of a thing, but the thing itself) to discern individual objects and then {{w|Figure-ground (perception)|categorize them as foreground or background}}. Sound may also assist in locating and identifying objects.  An app could use these techniques, as well as additional information, such as additional senses, such as distance to the subject and light outside our visual spectrum.
+Humans use size, focus, edge-assignment, movement (of both the subject and the observer), and stereoscopic vision when looking at a scene (not a picture of a thing, but the thing itself) to discern individual objects and then {{w|Figure-ground (perception)|categorize them as foreground or background}}. Sound may also assist in locating and identifying objects.  An app could use these techniques, as well as additional senses, such as distance to the subject and light outside our visual spectrum.
 Identifying objects in a photograph is harder.  A photograph is a static, usually monoscopic image that can only provide size and edge-assignment clues. Humans are only able to discern objects from background in photographs by comparing the photo against all of the things they've seen and everything they've learned about those things over the course of their life and {{w|Visual perception|identifying matching patterns}}.

@@ Line 8: / Line 8: @@
 ==Explanation==
-[[Cueball]] appears to be asking [[Ponytail]] to write an app that determines if a given picture is (1) taken in a {{w|national park}}, and (2) a picture of a bird. The first question is generally harder for a human to answer, but easy for an app that has access to location information, such as supplied by a {{w|Global Positioning System|GPS}} receiver or other positioning network, and a {{w|geographic information system}} (GIS). The second one is easy for a human but much harder for a computer. This illustrates {{w|Moravec's paradox}} from the 1980s in a modern context. By the 1950s computers were useful for tasks like {{w|trajectory optimization}}, {{w|automated theorem proving|generating novel mathematical proofs}}, and {{w|English_draughts#Computer_players|the game of checkers}}, so such high-level computation and reasoning tasks that were hard for humans turned out to be relatively easy for them. On the other hand, it turns out to be hard to "give them the skills of a one-year-old when it comes to perception", as Moravec wrote.
+[[Cueball]] appears to be asking [[Ponytail]] to write an app that determines if a given picture is (1) taken in a {{w|national park}}, and (2) a picture of a bird. The first question is generally harder for a human to answer, but easy for an app that has access to location information and a {{w|geographic information system}} (GIS). The second one is easy for a human but much harder for a computer. This illustrates {{w|Moravec's paradox}} from the 1980s in a modern context. By the 1950s computers were useful for tasks like {{w|trajectory optimization}}, {{w|automated theorem proving|generating novel mathematical proofs}}, and {{w|English_draughts#Computer_players|the game of checkers}}, so such high-level computation and reasoning tasks that were hard for humans turned out to be relatively easy for them. On the other hand, it turns out to be hard to "give them the skills of a one-year-old when it comes to perception", as Moravec wrote.
-In order to determine whether the user is in a national park, Ponytail plans to determine the user's location using the mobile device. This location will then be cross checked with a {{w|geographic information system}} (GIS) which will be able to determine whether the coordinates lie within a national park boundary.
+In order to determine whether the user is in a national park, Ponytail plans to determine the user's location using the location tracking receivers which are common in to many devices.  These provide location information using nearby radio sources, such as cell phone towers or WiFi hotspots, or the positions of satellites supplied by a {{w|Global Positioning System|GPS}} receiver. This location will then be checked with a {{w|geographic information system}} (GIS) which will determine whether the photographer is in a national park.
-Determining whether an image is of a given kind of natural object is far more difficult. This task falls into the area of {{w|computer vision}}. One of the goals in computer vision is to detect and classify objects within an image. This is a very challenging task for a number of reasons.
+Determining whether an image is of a given kind of natural object is far more difficult. This task falls into the area of {{w|computer vision}}. One of the goals in computer vision is to detect and classify objects within an image.
-Humans use size, focus, edge-assignment, movement (of both the subject and the observer), and stereoscopic vision when looking at a scene (not a picture of a thing, but the thing itself) to discern individual objects and then {{w|Figure-ground (perception)|categorize them as foreground or background}}. Sound may also assist in locating and identifying objects.  An app could use these techniques, as well as additional information, such as distance to the subject and observations of light outside our visual spectrum.
+Humans use size, focus, edge-assignment, movement (of both the subject and the observer), and stereoscopic vision when looking at a scene (not a picture of a thing, but the thing itself) to discern individual objects and then {{w|Figure-ground (perception)|categorize them as foreground or background}}. Sound may also assist in locating and identifying objects.  An app could use these techniques, as well as additional information, such as additional senses, such as distance to the subject and light outside our visual spectrum.
 Identifying objects in a photograph is harder.  A photograph is a static, usually monoscopic image that can only provide size and edge-assignment clues. Humans are only able to discern objects from background in photographs by comparing the photo against all of the things they've seen and everything they've learned about those things over the course of their life and {{w|Visual perception|identifying matching patterns}}.

@@ Line 8: / Line 8: @@
 ==Explanation==
-[[Cueball]] appears to be asking [[Ponytail]] to write an app that determines if a given picture is (1) taken in a national park, and (2) a picture of a bird. The first question is generally harder for a human to answer, but easy for an app that has access to location information, such as supplied by a {{w|Global Positioning System|GPS}} receiver or other positioning network, and a {{w|geographic information system}} (GIS). The second one is easy for a human but much harder for a computer. This illustrates {{w|Moravec's paradox}} from the 1980s in a modern context. By the 1950s computers were useful for tasks like {{w|trajectory optimization}}, {{w|automated theorem proving|generating novel mathematical proofs}}, and {{w|English_draughts#Computer_players|the game of checkers}}, so such high-level computation and reasoning tasks that were hard for humans turned out to be relatively easy for them. On the other hand, it turns out to be hard to "give them the skills of a one-year-old when it comes to perception", as Moravec wrote.
+[[Cueball]] appears to be asking [[Ponytail]] to write an app that determines if a given picture is (1) taken in a {{w|national park}}, and (2) a picture of a bird. The first question is generally harder for a human to answer, but easy for an app that has access to location information, such as supplied by a {{w|Global Positioning System|GPS}} receiver or other positioning network, and a {{w|geographic information system}} (GIS). The second one is easy for a human but much harder for a computer. This illustrates {{w|Moravec's paradox}} from the 1980s in a modern context. By the 1950s computers were useful for tasks like {{w|trajectory optimization}}, {{w|automated theorem proving|generating novel mathematical proofs}}, and {{w|English_draughts#Computer_players|the game of checkers}}, so such high-level computation and reasoning tasks that were hard for humans turned out to be relatively easy for them. On the other hand, it turns out to be hard to "give them the skills of a one-year-old when it comes to perception", as Moravec wrote.
 In order to determine whether the user is in a national park, Ponytail plans to determine the user's location using the mobile device. This location will then be cross checked with a {{w|geographic information system}} (GIS) which will be able to determine whether the coordinates lie within a national park boundary.

@@ Line 8: / Line 8: @@
 ==Explanation==
-[[Cueball]] appears to be asking [[Ponytail]] to write an app that determines if a given picture is (1) taken in a national park, and (2) a picture of a bird. The first question is generally harder for a human to answer, but easy for an app that has access to location information and a {{w|geographic information system}} (GIS). The second one is easy for a human but much harder for a computer. This illustrates {{w|Moravec's paradox}} from the 1980s in a modern context. By the 1950s computers were useful for tasks like {{w|trajectory optimization}}, {{w|automated theorem proving|generating novel mathematical proofs}}, and {{w|English_draughts#Computer_players|the game of checkers}}, so such high-level computation and reasoning tasks that were hard for humans turned out to be relatively easy for them. On the other hand, it turns out to be hard to "give them the skills of a one-year-old when it comes to perception", as Moravec wrote.
+[[Cueball]] appears to be asking [[Ponytail]] to write an app that determines if a given picture is (1) taken in a national park, and (2) a picture of a bird. The first question is generally harder for a human to answer, but easy for an app that has access to location information, such as supplied by a {{w|Global Positioning System|GPS}} receiver or other positioning network, and a {{w|geographic information system}} (GIS). The second one is easy for a human but much harder for a computer. This illustrates {{w|Moravec's paradox}} from the 1980s in a modern context. By the 1950s computers were useful for tasks like {{w|trajectory optimization}}, {{w|automated theorem proving|generating novel mathematical proofs}}, and {{w|English_draughts#Computer_players|the game of checkers}}, so such high-level computation and reasoning tasks that were hard for humans turned out to be relatively easy for them. On the other hand, it turns out to be hard to "give them the skills of a one-year-old when it comes to perception", as Moravec wrote.
 In order to determine whether the user is in a national park, Ponytail plans to determine the user's location using the mobile device. This location will then be cross checked with a {{w|geographic information system}} (GIS) which will be able to determine whether the coordinates lie within a national park boundary.

@@ Line 14: / Line 14: @@
 Determining whether an image is of a given kind of natural object is far more difficult. This task falls into the area of {{w|computer vision}}. One of the goals in computer vision is to detect and classify objects within an image. This is a very challenging task for a number of reasons.
-Firstly, humans use size, edge-assignment, movement, and stereoscopic vision when looking at a scene (not a picture of a thing, but the thing itself) to discern individual objects and then {{w|Figure-ground (perception)|categorize them as foreground or background}}. A photograph, however, is a static, monoscopic image that can only provide size and edge-assignment clues. Humans are only able to discern objects from background in photographs by comparing the photo against all of the things they've seen and everything they've learned about those things over the course of their life and {{w|Visual perception|identifying matching patterns}}.
+Firstly, humans use size, edge-assignment, movement (of both the subject and the observer), and stereoscopic vision when looking at a scene (not a picture of a thing, but the thing itself) to discern individual objects and then {{w|Figure-ground (perception)|categorize them as foreground or background}}. An app could use these properties, as well as additional information, such as distance to the subject and observations of light outside the visual spectrum.
 Secondly, the quality of the photograph will have an impact on a computer's ability to match patterns. For example, the object in the photograph might be partially visible or occluded. In the case of a living bird, additional complications arise from the variations among individual birds of the same species and differences in pose (flying, perching in a tree, etc.). Differentiating between visually similar objects can result in false positives. For example, is it a photo of a [[1792: Bird/Plane/Superman|bird in flight or a plane <s>(or superman!)</s>]]? Ponytail's estimate of 5 years may be overly optimistic (see [[678: Researcher Translation]]).

@@ Line 20: / Line 20: @@
 The state-of-the-art algorithms for solving this kind of task (as of this comic's publishing) use local features (e.g. {{w|Scale-invariant feature transform|SIFT}} or {{w|Speeded up robust features|SURF}} in combination with a {{w|support vector machine}}) or a {{w|convolutional neural network}}.
-The subtitle refers to "CS", which is a common abbreviation for "{{w|Computer Science}}", of which {{w|artificial intelligence}} and {{w|computer vision}} are sub-disciplines. sustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainablesustainable
+The subtitle refers to "CS", which is a common abbreviation for "{{w|Computer Science}}", of which {{w|artificial intelligence}} and {{w|computer vision}} are sub-disciplines.
 The title text mentions [http://dspace.mit.edu/bitstream/handle/1721.1/6125/AIM-100.pdf The Summer Vision Project] and {{w|Marvin Minsky}} of MIT. In the summer of 1966, he asked his undergraduate student {{w|Gerald Jay Sussman}} to [http://szeliski.org/Book/ "spend the summer linking a camera to a computer and getting the computer to describe what it saw"]. {{w|Seymour Papert}} drafted the plan, and it seems that Sussman was joined by {{w|Bill Gosper}}, {{w|Richard Greenblatt (programmer)|Richard Greenblatt}}, {{w|Leslie Lamport}}, Adolfo Guzman, Michael Speciner, John White, Benjamin, and Henneman - in case the multiple Wikipedia links don't give it away, know that this is a sizable cross-section of the AI researchers of the period. The project schedule allocated one summer for the completion of this task. The required time was obviously significantly underestimated, since dozens of research groups around the world are still working on this topic today.

@@ Line 26: / Line 26: @@
 A month after this comic came out, {{w|Flickr}} [http://code.flickr.net/2014/10/20/introducing-flickr-park-or-bird/ responded] with a [http://parkorbird.flickr.com/ prototype online tool] to do something similar to what the comic describes, using its automated-tagging software. According to them, the bird solution "took us less than 5 years to build, though it's definitely a hard problem, and we've still got room for improvement".
-Now, years later, the second problem of detecting birds (or any other objects) in the image has also turned into a relatively easy application of existing technologies, because there now exists both open and closed source image classification neural networks after lots of groups have put in the years of research with teams of computer scientists into the problem of computer vision and thanks to recent breakthroughs in neural net architectures.
+Now, years later, the second problem of detecting birds (or any other objects) in the image has also turned into a relatively easy application of existing technologies. There now exist both open- and closed-source image classification neural networks, after many groups have put in the years of research (with teams of computer scientists) into the problem of computer vision, and thanks to recent breakthroughs in neural net architectures.
 ==Transcript==