This reminds me of the now very clichéd third law of Arthur C. Clarke: "Any sufficiently advanced technology is indistinguishable from magic." I could no longer resist looking behind the curtain, so I set off to discover how my infallible secretary accomplishes this. The answer is "GVFS" - the GNOME Virtual Filesystem, which layers an extensible metadata system on top of application I/O.
GVFS provides a command line tool, of course [this is UNIX!], that allows the savvy user to see into the filing cabinet of their infallible secretary.
$ gvfs-info -a "metadata::*" file:///home/awilliam/Documents/Informix_Python_Carsten_Haese.pdf
And there it is - "metadata::evince::page: 7" - which is how Evince takes me back to the page I left off on. There is lots of other information in there as well.
Command line tools are indispensable, but the immediate next question is: can I access this data from Python? Answer - of course! With the GIO module the data is there, ready to be explored.
>>> import gio
>>> handle = gio.File('/home/awilliam/Documents/Informix_Python_Carsten_Haese.pdf')
>>> meta = handle.query_info('metadata')
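The `gio` module above is the older PyGTK-era binding. As a rough sketch of the same query through PyGObject (assuming PyGObject is installed; the helper names `query_metadata` and `group_by_application` are my own, not part of GIO), the attributes can be pulled into a plain dictionary and grouped by the application that wrote them:

```python
def query_metadata(path):
    """Return {attribute_name: value} for the metadata namespace of `path`."""
    # Imported lazily so the grouping helper below works without PyGObject.
    import gi
    gi.require_version("Gio", "2.0")
    from gi.repository import Gio

    handle = Gio.File.new_for_path(path)
    info = handle.query_info("metadata::*", Gio.FileQueryInfoFlags.NONE, None)
    names = info.list_attributes("metadata") or []
    return {name: info.get_attribute_as_string(name) for name in names}


def group_by_application(attrs):
    """Group 'metadata::<app>::<key>' names by their <app> segment.

    Names with no application segment ('metadata::<key>') are filed
    under the empty string.
    """
    grouped = {}
    for name, value in attrs.items():
        parts = name.split("::", 2)
        if parts[0] != "metadata" or len(parts) < 2:
            continue  # not a metadata attribute, skip it
        if len(parts) == 3:
            app, key = parts[1], parts[2]
        else:
            app, key = "", parts[1]
        grouped.setdefault(app, {})[key] = value
    return grouped


# Usage (the path is whatever file you are curious about):
#   attrs = query_metadata("/path/to/some.pdf")
#   print(group_by_application(attrs))
```

For a PDF that Evince has opened, the grouped result should contain an "evince" entry holding the remembered page, matching what gvfs-info printed.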
Now knowing that, the System Administrator part of my psyche needs to know: where is all this metadata? His first guess was that it was being stored in the filesystem using extended attributes:
$ getfattr --dump "/home/awilliam/Documents/Informix_Python_Carsten_Haese.pdf"
Bzzzzt! Nothing there. Enough with guessing; every System Administrator worth his or her salt knows that guessing [ugh!] is for PC jockeys and web developers. The correct approach is to drag the appropriate application out to the shed and ... make it talk. It turns out that gvfs-info doesn't put up much of a fight - one glimpse of strace and he's confessing everything.
$ strace -e trace=open gvfs-info -a "metadata::*" "file:///home/awilliam/Documents/Informix_Python_Carsten_Haese.pdf"
Yes, there it is.
open("/home/awilliam/.local/share/gvfs-metadata/home", O_RDONLY) = 6
$ file /home/awilliam/.local/share/gvfs-metadata/home
A memory-mapped database file [see the "m" after each PID in the fuser output below - that means memory mapped]. The PIDs are those of the applications currently performing operations via GIO. The use of memory-mapped files means that read operations require no IPC [inter-process communication] or even syscalls for multiple applications to see the same state.
Now I had to do a little digging for GVFS documentation to understand how they manage concurrency - as multiple writers to a memory-mapped file is a dicey business [and GIO applications feel rock solid]. The answer is the gvfsd-metadata process. Applications using GIO push all their writes / changes to that process over D-Bus, so only one process writes and everyone else reads through the memory-mapped file. Concurrency issues are elegantly side-stepped. Brilliant.
$ fuser -u /home/awilliam/.local/share/gvfs-metadata/home
/home/awilliam/.local/share/gvfs-metadata/home: 2517m(awilliam) 2678m(awilliam) 26624m(awilliam)
$ ps -p 2517
PID TTY TIME CMD
2517 ? 00:08:13 nautilus
$ ps -p 2678
PID TTY TIME CMD
2678 ? 00:01:56 gvfsd-metadata
$ ps -p 26624
PID TTY TIME CMD
26624 ? 00:00:17 gedit
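The "m" that fuser prints can be cross-checked from Python by looking for the file in each process's memory map. This is only a sketch, assuming a Linux /proc filesystem; the function name `pids_mapping` is made up for illustration, and it can only see processes whose maps file you are allowed to read:

```python
import os


def pids_mapping(path, proc_root="/proc"):
    """Return the sorted PIDs whose memory maps include `path` (Linux only)."""
    target = os.path.realpath(path)
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only the numeric directories are processes
        maps_file = os.path.join(proc_root, entry, "maps")
        try:
            with open(maps_file, errors="replace") as maps:
                for line in maps:
                    # Each line: address perms offset dev inode [pathname]
                    fields = line.rstrip("\n").split(None, 5)
                    if len(fields) == 6 and fields[5] == target:
                        found.append(int(entry))
                        break
        except OSError:
            # The process exited, or we lack permission to inspect it.
            continue
    return sorted(found)


# Usage:
#   home_db = os.path.expanduser("~/.local/share/gvfs-metadata/home")
#   print(pids_mapping(home_db))
```

Run against the gvfs-metadata file, this should list the same PIDs that fuser flagged with an "m", at least for processes running as your own user.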
Now that the geek in me is sated I can go back to letting GNOME and its infallible secretary facilitate my productivity.