Talk:2304: Preprint

 
:As someone who regularly takes tables ''from'' PDF in order to put them into spreadsheets for further use, some people don't do me any favours by that method. Among the problems, if the table setter didn't pay attention to the column widths then the copied-out text of two adjacent cells that don't ''appear'' to overlap each other will interlace at a character level and need editing back to separate entities. And then there are the inconsistencies of header rows atop the table and/or atop the next new page the table splits over. I could run a quick script on (X)HTML tables, and get it perfectly for my needs. CSV, or even TabSV, would actually be my preferred transport format (i.e. ''no'' format, just pure layout without even spanned/merged cells, and I can redo what needs redoing on the final redo), but I can't ever seem to get them to do that for me despite having the data almost in that form prior to the PDFing... Grrrr. [[Special:Contributions/162.158.159.142|162.158.159.142]] 11:30, 10 May 2020 (UTC)
 
 
:: I feel your pain. I receive PDF documents from a financial professional, where an A4 landscape page seems to have about five two-column-wide tables side-by-side, and I'm still deciding what kind of manipulation to do, to get it into CSV and do some analysis. [[Special:Contributions/162.158.6.232|162.158.6.232]] 10:21, 12 May 2020 (UTC)
 
::: ''If'' the PDFing hasn't ruined the groupings/precedence, like it often does, try mouse-selecting each table, to copy and paste into notepad or equivalent. Sometimes that works well enough to create tab delimited elements (other times, it line-feeds between columns as well as rows, but still can be reconstructed) and then that'll paste into a spreadsheet (or be parsable with a script) better than any Paste Special (using "no textformat" options?) straight into a grid. Sometimes you need to fiddle a bit with the notepad text, but depending on the data that might be doable with a few choice find+replace runs, perhaps upon consecutive table-pastings to save you time repeating yourself. Or not. [[Special:Contributions/162.158.158.163|162.158.158.163]] 00:08, 13 May 2020 (UTC)
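A minimal sketch of the copy-and-paste route described above, assuming the text selected from the PDF has been saved as plain text. The function names and the `columns` parameter are illustrative assumptions, not anything the posters used. It handles the two cases mentioned: a paste that kept tabs between cells, and a paste that put a line feed after every cell, where the column count has to be supplied by hand.

```python
import csv
import io


def pasted_text_to_rows(text, columns=None):
    """Turn text copied out of a PDF table into a list of rows.

    If any line contains a tab, each line is treated as one row and
    split on tabs. Otherwise every non-blank line is treated as one
    cell, and the cells are regrouped `columns` at a time.
    """
    # Drop lines that are empty or whitespace-only.
    lines = [line for line in text.splitlines() if line.strip()]
    if any("\t" in line for line in lines):
        return [[cell.strip() for cell in line.split("\t")] for line in lines]
    if columns is None:
        raise ValueError("no tabs found; pass the number of columns")
    cells = [line.strip() for line in lines]
    return [cells[i:i + columns] for i in range(0, len(cells), columns)]


def rows_to_csv(rows):
    """Serialise rows as CSV text, quoting cells only where needed."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical paste where each cell landed on its own line.
    pasted = "Item\nAmount\nRent\n1,200\nFood\n300\n"
    print(rows_to_csv(pasted_text_to_rows(pasted, columns=2)), end="")
```

This does nothing for cells whose characters were interleaved by the PDF, and a final row that comes out shorter than `columns` usually means a cell was lost or split in the paste, so the result is still worth checking by eye.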
 
  
 
I think Randall's last point (no unprofessional humans use PDFs in 2020) is very wrong. Especially due to the coronavirus, all college classes have switched to online assignment submissions, and the teachers only accept PDF submissions (although, annoyingly, they give the original template files in .doc format!). I would NOT trust random college students' assignment submissions as a reputable information source! [[User:PotatoGod|PotatoGod]] ([[User talk:PotatoGod|talk]]) 17:22, 10 May 2020 (UTC)
 
