Feb 06 2014
0.JotHere.com LXK5HE, by DestinyArchitect N0L9TB, DestinyArchitect creation, Google Drive, {www=web} archive N0LA1L
MAZBVC:
- MAYCNN: currently updated every ~1 month; see the post history.
- MO3B4H: URL(s): http://1.JotHere.com/4255#N0L48F
- M33YGV: title: url_archive_MAXUKI: a Google Drive folder for community-archiving of content of URLs especially web pages
- MAZBWQ: subsections: WHAT TO DO, WHAT, WHY, WHEN, WHERE, COST, WHO’S PARTICIPATING, ADDITIONAL DETAILS, ADDITIONAL HISTORY, POST TODO, CREATORS, ADDITIONAL DETAILS FOR ORGANIZERS/DEVELOPERS, FOOTNOTES, POST HISTORY.
MEG922: WHAT TO DO
- MRR776: Kindly see & follow WHAT, or else reply explaining why not, ideally via comments here.
- N0L8VC: If you aren’t already participating (accessing the archive) but want to, reply with a comment here; write access is limited.
-end of WHAT TO DO
MNLART: WHAT
- N0L9FX: This documentation is at least ~70% complete, possibly more.
- N0L7VS: Stats (as of 2014.02)
- N0N0D1: used by about 6 Google accounts which include virtually all of OCAndroid leadership; I Destiny use it daily.
- N0L7W9: in use now: ~3 years
- N0L8KF: “4555 Files”, and nearly every one is a manually created archive of a web page; that’s a lot of web pages!
- N0L8S0: “.com” websites (url_archive_MAXUKI/com_): qty 79, including biggest “/Meetup_”
- N0L8MW: Size: “459 MB”
- N0N1AX: money cost: $0 (as the size is well under the free quota of Google Drive storage)
- N0L8P3: Known data loss: 0!
- N0L4U1: Does “community-archiving of content of URLs especially web pages” fairly easily & fairly reliably.
- N0L6PN: fairly easily
- N0L6QE: Each version of content is saved manually, almost always by using the web browser’s “Save As” function; no special software required.
- N0L7ZV: Requires use of the excellent Insync or an equivalent Google Drive client.
- N0MZ30: once you know how it works, takes (only) ~30 seconds.
- N0L6RX: automatic archiving planned; prior system had it.
- N0MWMW: A TOP CON: On any day you use Google Drive at all (as for this archive), you must work around Google Drive’s dangerous version retention policy, and I currently have only manual method(s), so that’s what I do.
- N0L70R: A TOP CON: Writers must NEVER delete any content in the archive, most especially content others created or may be referring to.
- N0LAQ7: Archiving a web page you are (about) to and/or have (just) changed:
- N0LAS6: This is seemingly the biggest use of this archive.
- N0LAU0: General procedure, in order:
- N0LAUO: content-before archiving: Just before you make any changes to any page, unless you’re doing 0 edits and only pure additions (notably only adding comments or Greets), archive the page
- N0LBXY: Edit session: make your edits while simultaneously starting content after archiving
- N0LC08: content after archiving: about every 15 minutes, interrupt whatever you’re doing and instead do an additional archive of the page if certain things hold:
- N0MXJF: certain things are basically, 1st, right before you might otherwise lose significant work and, 2nd, before any response change someone else might make; specifically if 1 or more of the following:
- N0MXEX: your edits/additions so far are visible and might result in anyone else making response changes, including additions but especially any removals.
- N0MXYL: On the page you have unarchived changes and at least 1 of the following is true, from most timely:
- N0MY0T: You are about to try making a non-trivial change to the page that may need to be undone.
- N0MXFP: you’ve put in at least about 1 hour of work on these unarchived changes
- N0MXL0: it’s approaching midnight, or else it has been at least 1 day since your last archive of the page.
- N0MY9S: This finishes the content after archiving for this Edit session.
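The "content after archiving" check above can be sketched as a small decision function. This is only an illustrative sketch, not software the archive uses: the names `EditSession`, `near_midnight` and `should_archive_now` are hypothetical, the 15-minute window for "approaching midnight" is an assumption, and so is the reading that nothing needs archiving when there are no unarchived changes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class EditSession:
    """Hypothetical record of one page's edit session (all names are assumptions)."""
    has_unarchived_changes: bool
    changes_visible_to_others: bool   # others may respond to (or remove) them
    about_to_try_risky_change: bool   # a non-trivial change that may need undoing
    unarchived_work: timedelta        # time invested since the last archive
    last_archive: datetime


def near_midnight(now: datetime, window: timedelta = timedelta(minutes=15)) -> bool:
    """True when 'now' is within 'window' of the next midnight (window is assumed)."""
    next_midnight = datetime.combine(now.date() + timedelta(days=1), datetime.min.time())
    return next_midnight - now <= window


def should_archive_now(s: EditSession, now: datetime) -> bool:
    """Run this at each ~15 minute interruption of the edit session."""
    if not s.has_unarchived_changes:
        return False                  # assumption: nothing new to save
    if s.changes_visible_to_others:
        return True                   # archive before any response change
    return (
        s.about_to_try_risky_change
        or s.unarchived_work >= timedelta(hours=1)
        or near_midnight(now)
        or now - s.last_archive >= timedelta(days=1)
    )
```

For example, a session with 20 minutes of private, low-risk edits at midday would not trigger an archive, while the same session at 23:50 would.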
- N0LCC8: For each edit session, this extra step results in 2 or sometimes more archives, when in an ideal world one would need just the first archive,
- N0MYFX: which is upsetting (though the space to save is generally cheap) and needed as it’s NOT an ideal software world.
- N0MYG8: Might be well fixed by (software) going thru the history of changes and removing duplicate (else similar) save storage while still somehow keeping a note that the content was sampled on this later date and hadn’t changed (or else just the amount of change)
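A minimal sketch of the duplicate-removal idea above, assuming the saved versions of one page are available as an oldest-first list of (date, content) pairs. The function name `dedupe_snapshots` and the record layout are hypothetical, and only exact duplicates are collapsed (the "similar" case is not handled); it is not an existing tool.

```python
import hashlib


def dedupe_snapshots(snapshots):
    """Collapse consecutive identical versions, oldest first.

    snapshots: list of (date_string, content_bytes) pairs.
    Returns one record per distinct run of content; dates on which the
    content was sampled again and found unchanged are kept as notes.
    """
    kept = []
    for date, content in snapshots:
        digest = hashlib.sha256(content).hexdigest()
        if kept and kept[-1]["sha256"] == digest:
            # Same content as the previous kept version: store only the date.
            kept[-1]["unchanged_on"].append(date)
        else:
            kept.append({
                "date": date,
                "sha256": digest,
                "content": content,
                "unchanged_on": [],
            })
    return kept
```

Comparing only against the previous kept version means content that later reverts to an older state is stored again, which keeps the history in order at the price of some repeated storage.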
- N0L7E6: Archiving any never-before-edited Meetup event listing that Meetup generated automatically (so part of an event series) requires a few extra special steps.
- N0L11X: With every auto-generated (notably auto-series) Meetup event listing,
- N0L1F3: the page initially has a URL with the event “#” containing no numerics, indeed all letters, here www.meetup.com/OCAndroid/events/qlkqkfysdbrb
- N0L1GO: This URL is temporary: if the event never happens (as when the series is changed/canceled), and sometimes seemingly spontaneously after a few days, Meetup software will dump this URL
- N0L1J8: the moment such listing is edited in the slightest (not just a description/date/name edit, but even a comment or an RSVP)
- N0L1OH: A permanent URL is generated for the event, ending in all numerics, as here http://www.meetup.com/OCAndroid/events/164406282/comments/308438252
- N0L1ON: The prior temp URL is set to redirect to this permanent URL for a few days at least, but after that the redirect ends and the temp URL is possibly reused.
- N0L1P4: so to 1st archive such a page,
- N0MZSB: do in order:
- N0L1QQ: Make a slight change;
- N0L1UF: I recommend doing as I do: post an attendance thread for myself, as “DestinyArchitect ATTENDANCE & REVIEW OF THIS EVENT –post all on that here in this thread<br/>*I am now starting edits of this listing(also generating a permanent URL)<br/>*I plan to attend”
- N0L1VC: This then instantly creates the permanent URL; be sure to do a browser page refresh to see it.
- N0L1U0: archive the page as normal.
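Before archiving a Meetup event page, the URL form described above (all letters for a temporary event ID, all numerics for a permanent one) can be checked with a few lines of code. This is a sketch based only on the two example URLs given here; the function name `meetup_event_url_kind` is hypothetical, and Meetup's real ID rules may differ.

```python
import re
from urllib.parse import urlsplit


def meetup_event_url_kind(url: str) -> str:
    """Classify a Meetup event URL as 'temporary', 'permanent', or other.

    Assumes, per the examples above, that a temporary event ID is all
    letters and a permanent event ID is all digits.
    """
    if "://" not in url:
        url = "http://" + url          # allow URLs written without a protocol
    match = re.search(r"/events/([^/]+)", urlsplit(url).path)
    if match is None:
        return "not-an-event"
    event_id = match.group(1)
    if event_id.isdigit():
        return "permanent"
    if event_id.isalpha():
        return "temporary"
    return "unknown"


print(meetup_event_url_kind("www.meetup.com/OCAndroid/events/qlkqkfysdbrb"))
print(meetup_event_url_kind(
    "http://www.meetup.com/OCAndroid/events/164406282/comments/308438252"))
```

A "temporary" result means the slight-change step should be done first, so the page is archived under its permanent URL.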
- N0L1YR: If this was not done in advance, so the page got archived under temporary URL(s)
- N0L7OG: as happened for Martin
- N0L1W8: Before any edits to it, it appears you did what is _almost always_ the right thing: you archived it, saving it to a path matching this URL, url_archive_MAXUKI/com_/meetup_/www_/OCAndroid/events/qlkqkfysdbrb , creating https://drive.google.com/#folders/0B1iBaZhjEYO4ZnI5S0V6RjhIQzA
- N0L7P3: This motivated me to create this section and indeed (finally) this entire article
- N0MZUO: do in order:
- N0L6Q5: fairly reliably
- N0L7BY: per the low data loss reported in the Stats above (N0L8P3).
- N0L599: uses (archives into) “url_archive_MAXUKI: a Google Drive folder”
- N0L60S: URL to folder-path conversion:
- N0L5W0: Real & exemplary example: http://meetup.com/OCAndroid/events/164406282 is archived into folder url_archive_MAXUKI/com_/meetup_/www_/OCAndroid/events/164406282/
- N0L61C: a domain component ends with “_” and components are flipped into biggest-first: example: “www.meetup.com” becomes “com_/meetup_/www_”
- N0L66P: Most every component of the URL, except the protocol (http,https,ftp) has its own folder, and in the order it occurs in the URL except for domains.
- N0L67O: Each variable setting of the URL uses the “&” prefix even if it’s the first setting, as “?name=john” or “&name=john” are both represented “&name=john”
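The conversion rules above can be sketched in code as follows. This is a minimal illustration, not the archive's official tooling: the function name `url_to_folder` is hypothetical, and placing each "&" setting in its own folder after the path folders is an assumption drawn from the rule that most every URL component gets its own folder.

```python
from urllib.parse import urlsplit

ARCHIVE_ROOT = "url_archive_MAXUKI"


def url_to_folder(url: str) -> str:
    """Convert a URL into its archive folder path, per the rules above."""
    parts = urlsplit(url)
    host = parts.hostname
    if not host:
        raise ValueError("URL needs a protocol and a domain: %r" % url)
    # Domain components: suffix each with "_" and flip to biggest-first.
    domain_folders = [label + "_" for label in reversed(host.split("."))]
    # Path components keep the order in which they occur in the URL.
    path_folders = [segment for segment in parts.path.split("/") if segment]
    # Every variable setting gets the "&" prefix, even the first one
    # (its own folder per setting is an assumption).
    query_folders = ["&" + setting for setting in parts.query.split("&") if setting]
    folders = [ARCHIVE_ROOT] + domain_folders + path_folders + query_folders
    return "/".join(folders) + "/"


print(url_to_folder("http://www.meetup.com/OCAndroid/events/164406282"))
# url_archive_MAXUKI/com_/meetup_/www_/OCAndroid/events/164406282/
```

The protocol (http, https, ftp) is dropped, as the rules state, so the http and https forms of a URL map to the same folder.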
- N0L6EY: Each URL content for a given URL
- N0L6G5: has its own folder (named that URL)
- N0L6HJ: is a file with a name which:
- N0L6IB: gives a unique ID
- N0L6J4: if HTML, tells
- N0L6KG: how the content was saved, as “HTML[ Only]” or “Complete” or some others
- N0LCHE: “HTML only”
- N0LCK0: should be used unless it is proven here not to work and another method (as HTML Complete) is justified despite the significantly greater space it takes (plus is shown to work)
- N0LCKA: works perfectly for all of Meetup.com, especially since Meetup wonderfully never seems to delete all the content its web pages internally link to (pics, CSS files, JavaScript, etc).
- N0L6KQ: browser used to save it, as “Chrome”, “FF”, “IE”
- N0L6LO: notably does NOT tell:
- N0L6M4: Any part of the URL (already covered by its folder path)
- N0L6MP: Any of the content, as the page title (as that’s already in the content and easily & instantly findable via folder content search)
- N0LAWV: The date of the archive (unless a snapshot not to be edited) as that is given by Google Drive/Subversion history.
- N0LB22: each archive content file
- N0LB2O: if NOT a snapshot (the usual case)
- N0LB3D: is to be overwritten with the latest current content, but only once it is OK to do so:
- N0LB5T: only once (check for this!) the client (Insync/TortoiseSVN) reports that all present content has been successfully archived/checked in, which it reports via a green (not red or blue) checkmark when the file is seen in the OS’s normal file explorer; otherwise you will permanently overwrite, and so lose, the last (unarchived) contents of the file
- N0LBB5: then the Google Drive/Subversion will still have the prior version & generally all prior versions
- N0LBK1: which you can access read-only (plus -be careful!- delete)
- N0LBKW: NOT via Insync-or-equivalent client (currently)
- N0LBLK: you can via http://drive.google.com : find the file there, then right-click > Manage Revisions.
- N0L8X6: several more to be documented here.
- N0L5A2: Mostly contains public info
- N0L5B1: Can & sometimes does contain private info
- N0L52B: This folder, and its contents (unless individually overridden), has access settings “anyone with a link”
- N0L55S: allowing read-only access by anyone just by giving him/her the link
- N0L57F: because of this, plus the fact that the privacy of some content depends on the link not being found, only the URLs of {low-level aka deep} folders can be posted on the public web; posting any higher-level folder, most extremely the root folder URL, would be dangerous to privacy in terms of the amount of content it exposes to the public, especially since public search engines may crawl it.
-end of WHAT
MNYDBT: WHY
- N0L4JT: Created initially to archive http://Meetup.com content since Meetup.com
- N0L4P0: doesn’t keep past versions of almost any of its content (unlike a wiki)
- N0L4PA: encourages multi-person editing
- N0L4PZ: effectively encourages users to readily destroy each other’s content (as it is trivial for someone to delete a comment or Greet someone else has written).
- N0L8YO: appears to block archiving by Archive.Org
- N0L4RQ: Extremely useful for archiving all sorts of web pages.
- N0L4UX: Archive.Org only archives every 3 to 6 months and on its own schedule, which usually isn’t frequent enough and won’t work.
- N0L4SZ: I use it as my entire method to archive content of URLs
-end of WHY
MAX1LQ: WHEN
- N0N16X: See N0L7W9
-end of WHEN
MAX1N4: WHERE
-end of WHERE
MAX1VD: COST
-end of COST
MAXDSI: WHO’S PARTICIPATING
-end of WHO’S PARTICIPATING?
M33M3R: ADDITIONAL FUTURE PLANS
- .
-end of FUTURE PLANS
MO37MD: ADDITIONAL HISTORY
- N0L839: Replaces prior version http://0.JotHere.com ‘s JIT_archive_LJNZCF
- N0L888: by storage mechanism:
- N0L933: using Google Drive, mostly dramatically better (notably easier) , but with some real dangers of data loss which Google could fix if they bothered,
- N0L93Q: instead of Subversion, which is
- N0L949: Painful to set up & learn, so generally used only by the programmer or very-tech-savvy user
- N0L96H: Painful to use: you are routinely interrupted to stop & make a checkin or else suffer data loss (or you use continual WebDAV checkin, but that easily gets out of hand, permanently using up gobs of storage)
- N0L9AD: Very hard & tricky whenever folders are moved or renamed, which corrupts the checkout, as semi-regularly happens
- N0L99E: Generally impossible to delete wasted space from unneeded versions
- N0L9A2: Much safer in terms of preventing accidental deletions.
- N0L8C7: similar but improved URL to folder-path conversion
- N0MZ6C: For Subversion I had developed shell scripts for auto-archiving; not yet for Google Drive.
-end of ADDITIONAL HISTORY
MAX22R: ADDITIONAL DETAILS
- .
-end of ADDITIONAL DETAILS
MDE167: POST ADDITIONAL TODO, roughly in order:
- .
-end of POST TODO
MAYJ80: CREATORS
-end of CREATORS
MDAIRC: ADDITIONAL DETAILS FOR ORGANIZERS/DEVELOPERS/HELPERS
- .
MDAIRC: FOOTNOTES
- .
-end of FOOTNOTES
M31R7R: POST HISTORY, in order:
- N0L47O: Motivated by N0L7P3
- N0L438: I Destiny now created this post by Copy to a new draft (of http://1.JotHere.com/4250#N0L3UI (latest)) then gave it fresh IDs & content.
- N0L9GK: drafted 1st version, ~70% complete; usable.
- N0L9O7: add to category q(0.JotHere.com LXK5HE) q(Google Drive M8SLXW)
- N0L9SY: in category q(of DestinyArchitect MA0YLR) add category:
- N0L9ZA: in category q(www=World Wide Web N0C2LO) add category:
- N0LA5H: image cnt: 0 to 1;
- N0LBNX: additions as N0LB22 and N0LAQ7; some fixes
- N0LCTQ: first published; pst2014.02.06Thu1228.
- N0MWUT: Undid main URL lack of 4255 realizing that would likely create problems with copying it; N0MWMW: added; N0LAU0: drastically improved; N0MZ6C: added
- N0N04E: #N0L7E6: convert from pre-tag to std outline, including doing the ExWeb regex replace fr({[0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z]}=) to(<sup><a id=”\1″ class=”aself_KEP2FG”>\1</a>:</sup> );
- N0N133: N0L7VS: move in WHAT from near bottom to near top as a relevant sell/no-sell point; N0N16X, N0N18F, N0N1FX: added; M33YGV: replaced links to links to sections; pst2014.02.07Fri1024.
-end of POST HISTORY