
Interrogating the Infallible Secretary

Many GNOME applications behave almost magically: they seem to remember everything and know what you want. One example is the excellent PDF reader [Evince](http://projects.gnome.org/evince/); every time I open a PDF it opens to the same page as the last time I looked at that document. So when I get my morning coffee, switch to the GNOME Activity Journal, and see that "Informix_Python_Carsten_Haese.pdf" was the document I was reading at 16:59 the previous day, I can click on that document and it opens to the same slide it was displaying when I closed it. GNOME applications do this kind of thing all day, like an infallible secretary.

This reminds me of the now very cliché Clarke's third law: "Any sufficiently advanced technology is indistinguishable from magic" [the inverted form, about sufficiently advanced magic, is the one usually credited to Larry Niven]. I could no longer resist looking behind the curtain, so I set off to discover how my infallible secretary accomplishes this. The answer is "GVFS" - the GNOME Virtual Filesystem, which layers an extensible metadata system on top of application I/O.

GVFS provides a command line tool, of course [this is UNIX!], that allows the savvy user to see into the filing cabinet of their infallible secretary.

$ gvfs-info -a "metadata::*" file:///home/awilliam/Documents/Informix_Python_Carsten_Haese.pdf
attributes:
  metadata::evince::page: 7
  metadata::evince::dual-page-odd-left: 0
  metadata::evince::zoom: 1
  metadata::evince::window_height: 594
  metadata::evince::sizing_mode: fit-width
  metadata::evince::sidebar_page: links
  metadata::evince::window_width: 1598
  metadata::evince::sidebar_size: 249
  metadata::evince::dual-page: 0
  metadata::evince::window_x: 1
  metadata::evince::window_y: 91
  metadata::evince::show_toolbar: 1
  metadata::evince::window_maximized: 0
  metadata::evince::inverted-colors: 0
  metadata::evince::continuous: 1
  metadata::evince::sidebar_visibility: 1
  metadata::evince::fullscreen: 0

And there it is - "metadata::evince::page: 7" - how Evince takes me back to the page I left from, stored alongside lots of other view and window state.
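For scripting, the listing above is easy to turn into a dictionary. The following is a minimal sketch that assumes the output format shown above (an "attributes:" header followed by indented "key: value" lines); the function name parse_gvfs_info is hypothetical and not part of GVFS.

```python
def parse_gvfs_info(text):
    """Parse the 'attributes:' section of gvfs-info output into a dict.

    Assumes the layout shown in the transcript: an 'attributes:' header
    followed by indented 'key: value' lines. All values stay strings.
    """
    attributes = {}
    in_attributes = False
    for line in text.splitlines():
        if line.strip() == "attributes:":
            in_attributes = True
            continue
        # Ignore everything before the header and any non-indented line.
        if not in_attributes or not line.startswith(" "):
            continue
        # Keys contain '::' but never ': ', so the first ': ' ends the key.
        key, sep, value = line.strip().partition(": ")
        if sep:
            attributes[key] = value
    return attributes


# A few lines copied from the transcript above.
sample = """\
attributes:
  metadata::evince::page: 7
  metadata::evince::sizing_mode: fit-width
  metadata::evince::window_width: 1598
"""

evince = parse_gvfs_info(sample)
print(evince["metadata::evince::page"])
```

In practice the text would come from running the gvfs-info command shown above, for example through subprocess.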

Command line tools are indispensable, but the immediate next question is: can I access this data from Python? Answer - of course! With the GIO module the data is there, ready to be explored.

>>> import gio
>>> handle = gio.File('/home/awilliam/Documents/Informix_Python_Carsten_Haese.pdf')
>>> meta = handle.query_info('metadata')
>>> meta.has_attribute('metadata::evince::page')
True
>>> meta.get_attribute_string('metadata::evince::page')
'7'

Now, knowing that, the System Administrator part of my psyche needs to know: where is all this metadata? His first guess was that it was being stored in the filesystem using extended attributes:

$ getfattr --dump "/home/awilliam/Documents/Informix_Python_Carsten_Haese.pdf"
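The same guess can be tested from Python. This is a minimal sketch using os.listxattr and os.getxattr, which exist only on Linux; the helper name list_user_xattrs is hypothetical, and the "user." filter is there to mirror what getfattr dumps by default.

```python
import os


def list_user_xattrs(path):
    """Return {name: value} for the user.* extended attributes on `path`.

    Returns an empty dict when xattrs are unavailable: on platforms without
    os.listxattr, or on filesystems that do not support them.
    """
    if not hasattr(os, "listxattr"):
        return {}
    try:
        return {
            name: os.getxattr(path, name)
            for name in os.listxattr(path)
            if name.startswith("user.")  # getfattr's default namespace
        }
    except OSError:
        return {}


# Inspect this script's own working directory as a harmless demonstration.
print(list_user_xattrs(os.getcwd()))
```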

Bzzzzt! Nothing there. Enough with guessing; every System Administrator worth his or her salt knows that guessing [ugh!] is for PC jockeys and web developers. The correct approach is to drag the appropriate application out to the shed and ... make it talk. It turns out that gvfs-info doesn't put up much of a fight - one glimpse of strace and he's confessing everything.

$ strace -e trace=open gvfs-info -a "metadata::*" "file:///home/awilliam/Documents/Informix_Python_Carsten_Haese.pdf"
...
open("/home/awilliam/.local/share/gvfs-metadata/home", O_RDONLY) = 6

Yes, there it is.

$ file  /home/awilliam/.local/share/gvfs-metadata/home
/home/awilliam/.local/share/gvfs-metadata/home: data
$ fuser -u /home/awilliam/.local/share/gvfs-metadata/home
/home/awilliam/.local/share/gvfs-metadata/home:  2517m(awilliam)  2678m(awilliam) 26624m(awilliam)
$ ps -p 2517
 PID TTY          TIME CMD
 2517 ?        00:08:13 nautilus
$ ps -p 2678
 PID TTY          TIME CMD
 2678 ?        00:01:56 gvfsd-metadata
$ ps -p 26624
 PID TTY          TIME CMD
 26624 ?        00:00:17 gedit
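What fuser reports here can be approximated from Python on Linux by scanning /proc/<pid>/maps, which lists every file mapped into a process. This is only a sketch: the function name pids_mapping is hypothetical, and processes owned by other users are silently skipped because their maps files are unreadable.

```python
import glob
import os


def pids_mapping(path):
    """Return the sorted PIDs whose address space maps `path` (Linux only)."""
    target = os.path.realpath(path)
    pids = set()
    for maps_file in glob.glob("/proc/[0-9]*/maps"):
        pid = int(maps_file.split("/")[2])
        try:
            with open(maps_file, encoding="utf-8", errors="replace") as fh:
                for line in fh:
                    # Fields: address perms offset dev inode pathname
                    fields = line.split(None, 5)
                    if len(fields) == 6 and fields[5].rstrip("\n") == target:
                        pids.add(pid)
                        break
        except OSError:
            continue  # process exited, or it belongs to another user
    return sorted(pids)


# The path is the one strace revealed above, relative to the current user.
print(pids_mapping(os.path.expanduser("~/.local/share/gvfs-metadata/home")))
```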

A memory-mapped database file [see the "m" after the PID in the output of fuser - that means memory mapped]. And the PIDs are those of the applications currently performing operations via GIO. The use of memory-mapped files means that, once the file is mapped, read operations require no IPC [inter-process communication] or even syscalls for multiple applications to see the same state. Now I had to do a little digging for GVFS documentation to understand how they manage concurrency - as having multiple writers to a memory-mapped file is a dicey business [and GIO applications feel rock solid]. The answer is the gvfsd-metadata process. Applications using GIO push all their writes / changes to that process over D-Bus; so only one process writes, and everyone else reads through the memory-mapped file. Concurrency issues are elegantly side-stepped. Brilliant.
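The single-writer, many-readers idea can be illustrated with a toy example. This sketch is not how gvfsd-metadata is implemented (its file format and D-Bus protocol are not reproduced here); it only shows that a read-only shared mapping sees changes made by the one code path that writes the file. For brevity both roles live in one process, and os.pwrite limits it to Unix-like systems.

```python
import mmap
import os
import tempfile

# A small temporary file standing in for the metadata database.
fd, path = tempfile.mkstemp()
os.write(fd, b"page=7 ")
os.close(fd)

# Reader: map the file read-only, as a client application would.
with open(path, "rb") as reader_file:
    view = mmap.mmap(reader_file.fileno(), 0, access=mmap.ACCESS_READ)

before = view[:7]

# Single writer: the only code path that ever modifies the file.
writer_fd = os.open(path, os.O_WRONLY)
os.pwrite(writer_fd, b"page=12", 0)
os.close(writer_fd)

# The reader observes the update through its existing mapping,
# without reopening or re-reading the file.
after = view[:7]

view.close()
os.remove(path)
print(before, after)
```

Because readers never write, they need no locking among themselves; all coordination is pushed onto the single writer.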

Now that the geek in me is sated I can go back to letting GNOME and its infallible secretary facilitate my productivity.
