ArchiveBox or similar for shared archiving of research project

Stopwatch1986@lemmy.ml · 4 days ago

In other news (well, from April): https://post.parliament.uk/facial-recognition-technology-in-policing

Stopwatch1986@lemmy.ml · 5 days ago

Thanks for doing the digging. An archivist may know something more. Or the archive.is people.

Stopwatch1986@lemmy.ml · 5 days ago

I have been using Zotero every day for more than two decades and somehow it hasn’t crossed my mind. You may be on to something.

Zotero supports public and private shared bibliographies that you can subscribe to through the client or their web interface. Each entry contains the bibliographical details, note attachments, file attachments and links to local files. It also captures webpages and metadata through the browser addon. The local database can be backed up and, if self-hosted, you have control. The best part is that academic researchers will be familiar with the software and process. One downside is that the cached file is not independently archived, so it could be tampered with. Thanks for the idea.
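One way to soften the tampering concern is to keep a checksum manifest of the cached files somewhere outside the shared library. This is only a minimal sketch in Python and not a Zotero feature: the function names and the folder layout are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict:
    """Map each file's path (relative to folder) to its digest."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder: Path, manifest: dict) -> list:
    """List manifest entries that are now missing or have a different digest."""
    current = build_manifest(folder)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

The manifest only helps if it is stored or published separately from the files it describes, otherwise whoever alters a snapshot can alter the manifest too.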

Stopwatch1986@lemmy.ml · 5 days ago

A wiki is a good idea. Putting a Singlefile or similar all-in-one file in a repository and providing index numbers organised as a look-up table would also work for easy retrieval by a random research user. Both require some admin and more effort from the researchers.
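The look-up table could be as simple as a CSV file kept next to the saved pages. A rough sketch in Python follows. The column names and helper functions are made up for illustration and are not part of SingleFile or any existing tool.

```python
import csv
import io

FIELDS = ["id", "url", "filename", "archived_on"]  # hypothetical column names


def add_entry(index: list, url: str, filename: str, archived_on: str) -> int:
    """Add a saved page to the index and return its index number.

    If the URL is already listed, its existing number is returned instead.
    """
    for row in index:
        if row["url"] == url:
            return row["id"]
    new_id = len(index) + 1
    index.append(
        {"id": new_id, "url": url, "filename": filename, "archived_on": archived_on}
    )
    return new_id


def find_by_id(index: list, entry_id: int):
    """Return the row for an index number, or None if it is not listed."""
    for row in index:
        if row["id"] == entry_id:
            return row
    return None


def to_csv(index: list) -> str:
    """Serialise the index as CSV text that can be committed to the repository."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(index)
    return buffer.getvalue()
```

A report would then cite "archive item 12", and a reader looks up row 12 to find the saved file.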

I wish there was a hostable version of archive.is for near-zero maintenance. You just submit a URL over the internet and the web page is cached once along with a screenshot. Then, anyone can access the archived version. This can be done already with archive.is but we have no control over its future, which is critical for long-term dependable archiving.

Stopwatch1986@lemmy.ml · 6 days ago

One advantage, and disadvantage, of having webrecorder host our archived pages is that the archive may survive longer than, or not as long as, our project.

I have been using singlefile for years. It’s great but not for seamlessly making cached web pages available to the general public reading our reports and finding that cited links are now dead. And it doesn’t support URLs that point to PDF or CSV files. A public-facing repository of singlefile files with an index for ToC might do it though. Simplicity is good for future-proofing an archive.

Something like archive.org or archive.is would be ideal, but we have no control over their future and practices.

Stopwatch1986@lemmy.ml · 6 days ago

I wonder if an authorised remote user (ie an affiliated researcher) can easily instruct ArchiveBox to store a URL and later retrieve it. Also, ideally a random user should be able to retrieve the archived web page or file (eg a PDF, CSV etc). The idea is that authorised researchers can get URLs archived, and then any user reading our reports can click on a citation and get our archived source if the original is not available any more. I’ll need to run it and see, but it looks promising.

Keeping the archive alive for years later, possibly after funding dries up, is another challenge but there are public repositories that may be suitable for that.

Stopwatch1986@lemmy.ml · 7 days ago

ArchiveBox or similar for shared archiving of research project

Stopwatch1986@lemmy.ml · 15 days ago

Scientists Invented a Disease to Test Whether A.I. Knew It Was Fake. Then, Chatbots Started Saying It Was Real

Stopwatch1986@lemmy.ml · 21 days ago

The implication is that sending links to encrypted files with the decryption key added to the URL (eg Thunderbird Send, Mega etc) is not zero-trust. Decryption may take place locally and the key part of the URL may not be sent to the file hosting service, but when the recipient clicks on the link and is served one-off code by the web site, that code may be compromised.
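To make the "key part of the URL" concrete, here is a small Python sketch that splits a share link into what goes to the server and what stays in the browser. The link and its `k=` format are invented for illustration and do not reflect any particular service.

```python
from urllib.parse import urlsplit


def split_share_link(link: str) -> dict:
    """Separate the part of a link sent in the HTTP request from the fragment.

    Browsers do not include the fragment (everything after '#') in the
    request, so a key placed there is not transmitted to the host directly.
    """
    parts = urlsplit(link)
    request_target = parts.path or "/"
    if parts.query:
        request_target += "?" + parts.query
    return {
        "host": parts.netloc,
        "request_target": request_target,
        "client_only_fragment": parts.fragment,
    }


# Hypothetical share link with the decryption key in the fragment.
info = split_share_link("https://files.example/download/abc123?lang=en#k=SECRETKEY")
print(info)
```

The catch is that any script the site serves can read the fragment from the page's location, so the key is only as safe as the code delivered on that visit.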

As we know, the best way to be sure is to do your own separate encryption, but without secure-by-design most people will think you are very odd for demanding that decryption is done separately and keys are shared through a different channel. Speaking from experience, no matter how much training they are given at work, most people, including HR, would rather you sent them sensitive documents (like passport scans) in the clear as email attachments or at least in a way that involves a single click (WeTransfer etc).

Stopwatch1986@lemmy.ml · 21 days ago

Zero-trust services and web access

Stopwatch1986@lemmy.ml · edit-2 3 months ago

In addition to wifi, Bluetooth beacons would be good too.

Seeing the same SSIDs (eg in a cinema) might also mean you are not moving, but then how can you tell you are not sitting near another train passenger with their hotspot on?
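A rough way to express "seeing the same SSIDs" is the overlap between two scans. The Python sketch below is only an illustration: the thresholds are arbitrary assumptions, and requiring at least two shared networks reduces, but does not solve, the fellow-passenger hotspot problem.

```python
def jaccard(scan_a: set, scan_b: set) -> float:
    """Share of SSIDs common to both scans, from 0.0 (none) to 1.0 (identical)."""
    union = scan_a | scan_b
    if not union:
        return 0.0
    return len(scan_a & scan_b) / len(union)


def looks_stationary(
    scan_a: set, scan_b: set, min_overlap: float = 0.6, min_shared: int = 2
) -> bool:
    """Guess that the device has not moved between two scans.

    A single shared SSID is treated as weak evidence, since it may be a
    hotspot travelling with you, so at least min_shared networks must match.
    """
    shared = scan_a & scan_b
    return len(shared) >= min_shared and jaccard(scan_a, scan_b) >= min_overlap
```

Networks that travel with the vehicle, such as on-board wifi, would still fool this check, which is where Bluetooth beacons or other signals would have to help.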

Stopwatch1986@lemmy.ml · 3 months ago

Jepster on Google Play was good but from v8.0 it won’t start if, like me, you have Google Play Store disabled. Presumably, they need that for the optional in-app purchases but they never replied to my email so I don’t know.

From FLOSS, I am experimenting with FitoTrack which looks promising. Another one is AAT.

Colota is great for general self-tracking.

Stopwatch1986@lemmy.ml · edit-2 3 months ago

I’ve been using this for a few weeks and it’s great. In addition to offline-first, it would be nice to be able to ask Colota: List my trips between date1 and date2 when I was near (ie within x meters of) point y.
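As an illustration of what such a query involves, here is a minimal Python sketch run against exported data. The `Trip` structure is a made-up stand-in and not Colota's actual data model, and the distance uses the haversine formula on a spherical Earth.

```python
from dataclasses import dataclass, field
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


@dataclass
class Trip:
    """Hypothetical trip record: a name, a start time and recorded (lat, lon) points."""
    name: str
    start: datetime
    points: list = field(default_factory=list)


def trips_near(trips, date1, date2, point, radius_m):
    """Trips starting between date1 and date2 with any point within radius_m of point."""
    lat_y, lon_y = point
    return [
        trip
        for trip in trips
        if date1 <= trip.start <= date2
        and any(
            haversine_m(lat, lon, lat_y, lon_y) <= radius_m
            for lat, lon in trip.points
        )
    ]
```

Checking every recorded point is fine for a personal log. A larger dataset would want a spatial index instead.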

I am planning to use this for a long time too, so a data export/import option for when I change my phone would be nice. I see Export but not Import.

Also, being able to delete trips between date1 and date2 would be useful. Currently, you can delete 1-by-1 or recent trips only.