1597: Git

Explain xkcd: It's 'cause you're dumb.
Revision as of 19:19, 31 October 2015 by Kynde (talk | contribs) (tl;dr)
Title text: If that doesn't fix it, git.txt contains the phone number of a friend of mine who understands git. Just wait through a few minutes of 'It's really pretty simple, just think of branches as...' and eventually you'll learn the commands that will fix everything.

Explanation

This explanation may be incomplete or incorrect: Title text - with the name of the friend not explained yet.

This comic is a play on how git, a popular version control system, is misused by people who have a very poor understanding of its inner workings. Git is a particularly apt target for such a joke because of its widespread use and the significant discrepancy between its perceived simplicity and its complex underlying design. Tutorials for git tend to make extensive use of a cozy bootstrap layout and deal only with the most basic commands needed to get started, which can fool a novice into believing that git can be used appropriately without extensive study. As this is rarely the case, a large group of git users (including Cueball) have a knowledge of git that extends only to memorizing a set of commands, rather than a conceptual understanding of what those commands actually do. As this habit will eventually lead to a corrupt working tree, Cueball suggests that Ponytail keep an alternative copy of her project outside git, which, of course, defeats the purpose of employing a version control system to begin with.

Git is a version control system often used to track changes to (usually) plain text files, such as computer code. Within a folder and its subfolders, the user tells Git which files to track. All the files tracked in this manner make up a repository. Internally, each time the user "commits" the current version of the code, Git saves a snapshot of the entire set of files; every file is stored under a hash of its contents, so identical content is stored only once rather than copied again for each commit. This approach allows the user to switch between versions of the code fairly quickly. However, it can be confusing for new users, because when changing between versions Git effectively rewrites the files under its control to match that version: one file may have several different versions depending on which state Git has set it to, but only one of these versions is visible at any given moment. The others are not hidden or moved; they do not exist as files in the folder until Git modifies the file to match that version.
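
The idea of storing file contents under a hash can be sketched in a few lines of Python. This is only an illustration, not Git's actual implementation: the ObjectStore class and the snapshot function are hypothetical names invented here, and real Git also compresses objects and records directories and commits as objects of their own. The one detail taken from Git itself is how a file ("blob") ID is computed: SHA-1 (Git's default hash) over a short header followed by the content.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 ID Git gives a blob: hash of 'blob <size>\\0' + content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class ObjectStore:
    """Hypothetical in-memory stand-in for Git's object database."""

    def __init__(self):
        self.objects = {}  # object ID -> content

    def put(self, content: bytes) -> str:
        oid = blob_id(content)
        # Identical content maps to the same ID, so it is stored only once.
        self.objects.setdefault(oid, content)
        return oid


def snapshot(store, files):
    """Record a full snapshot: every path mapped to the ID of its content."""
    return {path: store.put(data) for path, data in files.items()}


store = ObjectStore()
v1 = snapshot(store, {"README": b"hello\n", "main.py": b"print('hi')\n"})
v2 = snapshot(store, {"README": b"hello\n", "main.py": b"print('bye')\n"})
```

Both snapshots list every file, but the unchanged README points at the same stored object in v1 and v2, so the store ends up holding three objects rather than four.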

In addition to allowing the user to track changes to the files over time using "commits" (versions of the files stored by the user), Git also allows the user to develop several versions of the files in parallel using "branches" (mentioned in the title text). This allows a programmer to, for example, keep a stable, functioning version of their code in one branch, while developing a new feature in a separate branch. When the new feature is ready, Git provides tools to efficiently "merge" the changes from the development branch back into the main branch. While powerful, there are also several pitfalls which can confuse users. For example, a file may have only been committed in one branch (so it is only visible in that branch), causing a user who has switched to a different branch to think that file was lost somehow.
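
The "lost file" pitfall is easier to see in a toy model where a branch is nothing more than a name pointing at a commit. The Repo class below is a hypothetical sketch written for this explanation, not Git's real data model: commit IDs are a simple counter instead of hashes, and switching branches silently discards uncommitted changes, which real Git would not do.

```python
class Repo:
    """Toy model: commits are frozen snapshots, branches are names pointing at one."""

    def __init__(self):
        self.commits = {}              # commit ID -> (parent ID, {path: content})
        self.branches = {"main": None}
        self.current = "main"
        self.working_tree = {}         # the only version of the files visible on disk

    def commit(self):
        parent = self.branches[self.current]
        cid = len(self.commits) + 1    # simple counter instead of a hash
        self.commits[cid] = (parent, dict(self.working_tree))
        self.branches[self.current] = cid  # only the current branch moves forward
        return cid

    def branch(self, name):
        # A new branch starts out pointing at the same commit as the current one.
        self.branches[name] = self.branches[self.current]

    def switch(self, name):
        # Switching rewrites the working tree to match the target branch's commit.
        self.current = name
        tip = self.branches[name]
        self.working_tree = dict(self.commits[tip][1]) if tip is not None else {}


repo = Repo()
repo.working_tree["app.py"] = "v1"
repo.commit()

repo.branch("feature")
repo.switch("feature")
repo.working_tree["new_feature.py"] = "draft"
repo.commit()

repo.switch("main")
files_on_main = sorted(repo.working_tree)

repo.switch("feature")
files_on_feature = sorted(repo.working_tree)
```

After switching back to main, new_feature.py is gone from the working tree even though nothing was deleted; it reappears as soon as the feature branch is selected again.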

Sharing a Git repository with other users is done through a remote repository, such as GitHub, GitLab, or one set up by the user themselves. This remote repository acts as a central location through which collaborators share their work. Changes do not automatically propagate between users; instead, once someone has changes they are ready to share, they must upload ("push" in Git terminology) their changes to the remote repository. Other users can then download ("pull") those changes. This allows each user complete control over when changes are applied to their version of the files. Once one user has pushed his or her changes, all other users will need to merge those changes into their code before they can push. Depending on how much the changes conflict, Git may be able to automatically combine both users' versions, or the user may need to do so manually.
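
The rule that others must merge before they can push can be modelled as a simple reachability check: the remote accepts a push only if the commit it currently points at is already part of the history being pushed. The sketch below is a simplified assumption about that behaviour, with hypothetical names (Remote, ancestors) and made-up commit labels; real Git also offers alternatives such as rebasing or forcing a push, which are left out here.

```python
def ancestors(parents, tip):
    """Return tip plus every commit reachable from it through parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


class Remote:
    """Hypothetical shared repository that tracks the tip of a single branch."""

    def __init__(self, tip):
        self.tip = tip

    def push(self, parents, local_tip):
        # Reject the push if it would drop commits the remote already has.
        if self.tip not in ancestors(parents, local_tip):
            return False
        self.tip = local_tip
        return True


# Commit graph: each commit maps to a tuple of its parents.
parents = {
    "base": (),
    "alice1": ("base",),
    "bob1": ("base",),
    "bob_merge": ("bob1", "alice1"),  # Bob merges Alice's work into his own
}

remote = Remote("base")
alice_pushed = remote.push(parents, "alice1")
bob_pushed_early = remote.push(parents, "bob1")
bob_pushed_after_merge = remote.push(parents, "bob_merge")
```

Alice's push succeeds because her commit builds on what the remote already has. Bob's first attempt is refused because his history does not contain Alice's commit; once he has merged it, his push goes through.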

In programming, Git is a very popular way to share source code of programs between computers and users and thus work on projects collaboratively.

However, problems often arise when, for example, one attempts to upload changes to a file that someone else has already edited. Git has quite a few tricks for handling such "merging" automatically.

One way of simplifying collaboration is to work in a different "branch" than other collaborators. All branches independently track their changes and can be viewed independently of each other. Only when you successfully "merge" (there we go again) individual branches together will you see other collaborators' "commits" in your working set of files.

But, due to the complex nature of Git (and its notoriously counter-intuitively named commands), a large portion of users are unable to use it beyond basic commands. They usually consider it much more efficient to save the code to a different file, download a newer copy, and then re-apply their original changes to the new copy than to try to understand and use Git's own convoluted built-in commands to fix the problem properly.

Git was originally created by Linus Torvalds, the same person who originally created Linux.

tl;dr

The explanation above was written by that friend whose name is in git.txt, and gives a good idea of what you need to wait through before he tells you the commands you need. In short: programmers use version control systems to track changes to code. Most of these version control systems are quite similar and easy to learn if you already know another one. Git is a version control system based on completely different principles, and most programmers find it difficult to wrap their heads around it. Cueball is one of those programmers.

Transcript

[Cueball points to a computer on a desk while Ponytail and Hairy are standing further away behind an office chair.]
Cueball: This is Git. It tracks collaborative work on projects through a beautiful distributed graph theory tree model.
Ponytail: Cool. How do we use it?
Cueball: No idea. Just memorize these shell commands and type them to sync up. If you get errors, save your work elsewhere, delete the project, and download a fresh copy.



Discussion

If someone is interested, the best book I've read on it is Pro Git. The chapters 2 and 3 explain pretty well this mess of branching and merging. But it's true that it takes a bit of patience to go over it all. 108.162.228.35 08:47, 30 October 2015 (UTC)

Also take a look at GitFlow: A Successful Git Branching Model. Though Randall is correct there usually comes a time when it is easier to give up and "start again". 162.158.34.147 08:53, 30 October 2015 (UTC)

I never liked the name of this piece of software; in British English, the name "git" is mildly rude :-) https://en.wikipedia.org/wiki/Git_(slang) . Gearóid (talk) 09:20, 30 October 2015 (UTC)

According to word of god it was on purpose: https://en.wikipedia.org/wiki/Git_(software)#History 162.158.22.46 11:41, 30 October 2015 (UTC)
He also designed it in such a way that people often run into problems with commitment to detached heads, and typically deal with this by reflogging... 108.162.249.161 (talk) (please sign your comments with ~~~~)

'Internally, Git works by saving the differences between various versions of the files, rather than creating a new copy each time the user "commits" the current version of the code.' - It is exactly the opposite. It stores whole files, or rather all committed pieces of data (blobs). See http://gitready.com/beginner/2009/02/17/how-git-stores-your-data.html 141.101.88.202 09:38, 30 October 2015 (UTC)TK

It is stored as diffs in pack files. Whole files (loose objects) are packed automatically by default.
See https://schacon.github.io/gitbook/7_the_packfile.html and https://www.kernel.org/pub/software/scm/git/docs/git-pack-objects.html

162.158.177.59 10:15, 30 October 2015 (UTC)

Not sure what pack files are used for, but data is stored as is and named by the SHA-1 of its contents. See object model in the same reference. Walenc (talk) 16:02, 30 October 2015 (UTC)

I think you guys need to differentiate between the underlying data scheme, and the command line. The way git stores underlying data is indeed beautiful, but the command-line is the worst UI ever. You know how you switch to working on a different branch? "git checkout". You know how you revert the changes you've made to a file? "git checkout". You know how you make a new branch? "git checkout -b". If you're used to other systems, you'll find nearly every operation - even common ones - counterintuitively named. I work at Google and even here, every week someone near me screws up their repository enough that they have to save their work, nuke their repo, reapply their changes, and try moving forward again. I don't know why anyone puts up with this! (Actually I do - it's because if you're collaborating between companies, git does it better than anything else.) 199.27.129.107 18:46, 2 November 2015 (UTC)

That's not actually true. git checkout takes you to a node of development, as a convenience that can be either the entire code base (a branch) or a single file. You could remove the file you want to 'revert', stash all other changes, checkout HEAD and then pop the stash...or use the git checkout FILE shortcut. git checkout -b is just a shortcut so you don't have to do git branch; git checkout. 108.162.220.239 06:00, 7 November 2015 (UTC)

I feel like this article should end with a quick guide to git commands. 108.162.216.27 (talk) (please sign your comments with ~~~~)

Well, I feel this article focuses on explaining git so much that it loses the point of the joke. We have Wikipedia to refer readers to ... The thing is, not just users who are unable to use git beyond a few basic commands, but also those who understand git often use some sort of "start over" method because an action looking perfectly legit got the repository into an unusable state, where recovery is much more difficult than reapplying patches. For one of the most common, search for "detached head", for example - especially funny when git insists on falling into that state after checking out master which is in direct contradiction to what docs say when it happens. But I don't feel like rewriting that, sorry :-/ --kavol, 141.101.96.206 16:04, 30 October 2015 (UTC)

I feel you've all been nerd-sniped. 108.162.216.8 19:33, 30 October 2015 (UTC)Pat

The problem is not about the working copy but about the branching tree structure and some git internals, which are quite confusing. This four-year-old reddit post can be used as a funny reference: https://www.reddit.com/r/programming/comments/embdf/git_complicated_of_course_not_commits_map_to/

http://tartley.com/?p=1267 "One of the things that tripped me up as a novice user was the way Git handles branches. Unlike more primitive version control systems, git repositories are not linear, they support branching, and are thus best visualised as trees, upon the nodes of which your current commit may add new leaf nodes. To visualise this, it’s simplest to think of the state of your repository as a point in a high-dimensional ‘code-space’, in which branches are represented as n-dimensional membranes, mapping the spatial loci of successive commits onto the projected manifold of each cloned repository." 108.162.210.212 (talk) (please sign your comments with ~~~~)

Should someone mention how git is by default used through a terminal - which is often more confusing than a GUI for most people - and that while there are graphical shells for git, some people refuse to use them because they're not fully-featured? 108.162.221.36 11:43, 30 October 2015 (UTC)

The really sad part of all this is that if you work in a multi-dev environment and anyone on the team is doing what Cueball suggests, it negates every other user's ability to use the main trunk properly. Ericm301 (talk) 02:26, 31 October 2015 (UTC)

Hasn't it got too extensive about git? I've never used git but quite understood the comedy. I just visited this page to know about git.txt and there's nothing about it but just long text that doesn't help whatsoever to understand the comic. 141.101.84.125 08:45, 31 October 2015 (UTC)

I agree completely! I've stripped out the overlong discussion of git's features. --Slashme (talk) 00:12, 1 November 2015 (UTC)

AFAIK, the git.txt is not the part of the Git itself. I just added it to explanation. 162.158.114.231 20:21, 31 October 2015 (UTC)

"This comic is a play on how git, a popular version control system, is misused by people who have a very poor understanding of its inner workings."

Comically missing the point. That is NOT what the comic is about, that is a poor excuse from a fanboy. --162.158.90.159 12:00, 1 November 2015 (UTC)

I agree the verbose "explanation" misses the point. The reality is that git is a confusing mess from a user's point of view. It's a very nice and powerful design from a technical point of view yet one that will mostly confuse anyone who encounters it at first; most people are afraid of admitting it because they don't want to look dumb. There's beauty in a design that is user-friendly at its core, and git misses that mark. Ralfoide (talk) 17:38, 1 November 2015 (UTC)
The same can be said of Linux. It seems to be a common theme in Linus Torvalds' work. 108.162.249.163 23:52, 1 November 2015 (UTC)

In pretty much every team I've worked on, I've found there ends up being one "git expert" who rises above the rest, and people continuously go see that person with "I don't know how to do X", to which the expert will often reply with a magic unheard-of-before git command line that looks pretty much like perl line noise. Ralfoide (talk) 17:38, 1 November 2015 (UTC)

In what world are telephones not an electronic means of communication? 141.101.75.245 10:56, 2 November 2015 (UTC)

That's not the point. The distinction was being made (ambiguously, perhaps) between electronic and vocal communication. We might naturally turn to telephones for the latter.--162.158.2.227 12:16, 2 November 2015 (UTC)

ExplainXKCD is usually amazing, but the explanation above is really "comically missing the point".

Git has a very cool distributed architecture, but the user experience is much more complex than other revision control systems. TFS and subversion can be taught to junior developers in about 20 minutes, but it takes much longer to learn how to use Git’s basic features. It is very easy for Git to become deadlocked, which requires some obscure commands to fix. Unless you are an expert at Git, it is sometimes easier to delete your project and try again.
There are things that Git does that other RCS don’t do. (I am not entirely sure what they are, to be totally honest. When the question is asked, the responses usually just talk about the architecture.) Git experts tend to like that the software is more powerful than other RCS systems, and some tend to be dismissive of how difficult other people find it to use. Many people (such as myself and Cueball) find the architecture cool, but are not Git experts.
So this is the joke. There is a conflict between how experts typically TALK about Git, and how most users actually USE Git. The humor comes from having a character say things that many people think, but wouldn’t say out loud for fear of looking stupid.

Would it be worth polishing the above and adding it to the description, or would that just be flamebait? 108.162.246.86 16:08, 2 November 2015 (UTC)

The title text may be referring to the famous saying: "Git is really pretty simple, just think of branches as homeomorphic endofunctors mapping submanifolds of a Hilbert space." 162.158.255.40 23:23, 2 November 2015 (UTC)

The current explanation is wrong [not anymore, it's excellent now!]. As others have stated, the comic is clearly making fun of git itself, NOT of its users. Daskas (talk) 13:44, 3 November 2015 (UTC)

Wow, it's amazing how there are comments above defending git: those commentators lost sight of the fact that XKCD is making fun of git because its idealistic view of source control doesn't map at all to reality, which in many cases leads to user frustration and... dare I say it, lost data and lost productivity. Git is a joke and XKCD highlighted that well :) 162.158.60.5 20:35, 21 December 2015 (UTC)

Someone made a website to be that "smart guy on the other end of the phone." The final entry on the page is this comic for sure.--Draco18s (talk) 16:17, 12 September 2016 (UTC)

I'd like to recommend a site I found on a recent (at the time of this comment) CS Educator stackexchange post. 108.162.216.106 05:33, 25 July 2017 (UTC)