python

Failure to apply LDAP pages results control.

On a particular instance of OpenGroupware Coils the switch from an OpenLDAP server to an Active Directory service - which should be nearly seamless - resulted in "Failure to apply LDAP pages results control.". Interesting, as Active Directory certainly supports paged results - the 1.2.840.113556.1.4.319 control.

But there is a caveat! Of course.

LDAP Search For Object By SID

All the interesting objects in an Active Directory DSA have an objectSID which is used throughout the Windows subsystems as the reference for the object. When using a Samba4 (or later) domain controller it is possible to simply query for an object by its SID, as one would expect, with a filter like "(&(objectSID=S-1-...))". However, when using a Microsoft DC, searching for an object by its SID is not as straightforward; attempting to do so will only result in an invalid search filter error.
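The teaser stops before the fix. One commonly used workaround, which may or may not be the one the full post settles on, is to put the binary form of the SID into the filter with every byte escaped as a backslash and two hex digits. The sketch below uses only the standard library; the function names sid_to_bytes and sid_search_filter are made up for illustration.

```python
import struct


def sid_to_bytes(sid):
    """Pack a string SID such as 'S-1-5-32-544' into its binary form."""
    parts = sid.split('-')
    if len(parts) < 3 or parts[0] != 'S':
        raise ValueError('not a string SID: %r' % (sid, ))
    revision = int(parts[1])
    authority = int(parts[2])
    subauthorities = [int(part) for part in parts[3:]]
    # Layout: revision (1 byte), sub-authority count (1 byte),
    # identifier authority (6 bytes, big-endian), then each
    # sub-authority as a 4 byte little-endian integer.
    return (
        struct.pack('BB', revision, len(subauthorities))
        + struct.pack('>Q', authority)[2:]
        + b''.join(struct.pack('<I', value) for value in subauthorities)
    )


def sid_search_filter(sid, attribute='objectSid'):
    """Build an LDAP filter that matches the SID by its escaped bytes."""
    raw = bytearray(sid_to_bytes(sid))
    escaped = ''.join('\\%02x' % octet for octet in raw)
    return '(%s=%s)' % (attribute, escaped)


search_filter = sid_search_filter('S-1-5-32-544')
```

The resulting string can be handed to whatever LDAP client is in use as an ordinary search filter.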

Deduplicating with group_by, func.min, and having

You have a text file with four million records and you want to load this data into a table in an SQLite database. But some of these records are duplicates (based on certain fields) and the file is not ordered. Due to the size of the data, loading the entire file into memory doesn't work very well. And due to the number of records, doing a check-at-insert when loading the data is also prohibitively slow. What does work pretty well is to load all the data and then deduplicate it.
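The title points at SQLAlchemy's group_by, func.min, and having. As a rough sketch of the same load-then-deduplicate idea, here it is in plain SQL through the standard library's sqlite3 module instead. The table name record and the columns name and city are invented for the example; substitute whichever fields define a duplicate.

```python
import sqlite3

connection = sqlite3.connect(':memory:')
connection.execute(
    'CREATE TABLE record (id INTEGER PRIMARY KEY, name TEXT, city TEXT)'
)

# Step 1: load everything, duplicates included, with no per-row checks.
connection.executemany(
    'INSERT INTO record (name, city) VALUES (?, ?)',
    [
        ('fred', 'Grand Rapids'),
        ('wilma', 'Kalamazoo'),
        ('fred', 'Grand Rapids'),
        ('fred', 'Lansing'),
        ('wilma', 'Kalamazoo'),
    ],
)

# Step 2 (optional): list the duplicated groups and the row each will keep.
duplicates = connection.execute(
    'SELECT name, city, MIN(id), COUNT(*) FROM record '
    'GROUP BY name, city HAVING COUNT(*) > 1 ORDER BY MIN(id)'
).fetchall()

# Step 3: keep the lowest id of every group and delete the rest.
connection.execute(
    'DELETE FROM record WHERE id NOT IN '
    '(SELECT MIN(id) FROM record GROUP BY name, city)'
)
connection.commit()

remaining = connection.execute(
    'SELECT id, name, city FROM record ORDER BY id'
).fetchall()
```

Keeping the minimum id per group is an arbitrary choice; any rule that picks exactly one row per group would do.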

Concerning Decorators

Decorators are a powerful, if arcanely constructed, feature of Python. Over at his blog Graham Dumpleton has written a series of [8 so far] articles covering Python decorators in depth. Lots of information here along with abundant examples.

Stream Peekaboo With Python

The Python standard library provides a reasonably adequate module for reading delimited data streams and there are modules available for reading everything from XLS and DIF documents to MARC data. One deficiency of many of these modules is the inability to gracefully deal with whack data; in the real world data is never clean, never correctly structured, and you are lucky if it is accurate even on the rare occasion that it is correctly encoded.

For example, when Python's CSV reader meets a garbled line in a file it throws an exception and stops, and you're done. It does not report which record it could not parse; all you have is a traceback. Perhaps in the output you can look at the last record and guess that the error lies one record beyond that... maybe.

Fortunately most of these modules work with file-like objects. As long as the object they receive properly implements iteration they will work. Using this strength it is possible to implement a Peekaboo on the input stream which allows us to see the unit of work currently being processed, or even to pre-mangle that chunk.
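A minimal sketch of such a wrapper follows; it is an illustration of the idea and not necessarily the class from the full post. The Peekaboo class and the read_rows helper are hypothetical names, and strict=True is passed to csv.reader only so that the sample's malformed quoting raises csv.Error.

```python
import csv
import io


class Peekaboo(object):
    """Iterator wrapper that remembers the last line handed to the parser."""

    def __init__(self, stream):
        self._stream = iter(stream)
        self.current = None    # the most recent raw line
        self.line_number = 0   # 1-based count of lines read so far

    def __iter__(self):
        return self

    def __next__(self):
        line = next(self._stream)
        self.line_number += 1
        # This is also the place to pre-mangle the chunk if required.
        self.current = line
        return line

    next = __next__  # Python 2 spelling of the iterator protocol


def read_rows(stream):
    """Parse CSV rows, collecting unparseable lines instead of stopping."""
    wrapper = Peekaboo(stream)
    reader = csv.reader(wrapper, strict=True)
    rows, rejects = [], []
    while True:
        try:
            row = next(reader)
        except StopIteration:
            break
        except csv.Error as error:
            # The wrapper knows exactly which line the reader choked on.
            rejects.append((wrapper.line_number, wrapper.current, str(error)))
            continue
        rows.append(row)
    return rows, rejects


sample = io.StringIO(u'a,b\nc,"d"e\nf,g\n')
rows, rejects = read_rows(sample)
```

For records that span several physical lines, current and line_number refer to the last line read, which is still a far better clue than a bare traceback.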

Paramiko's SFTPFile.truncate()

Paramiko is the go-to module for utilizing SSH/SFTP in Python. One of the best features of Paramiko is being able to SFTPClient.open() a remote file and simply use it like you would use a local file. SFTPClient's open() returns an SFTPFile, which is a file-like object that theoretically implements the same behavior as Python's native file object.

But the catch here is file-like. It is like a file, except when it is not like a file.

Encoding sambaNTPassword With Python

Samba's sambaNTPassword attribute, which mimics the corresponding NT / Active Directory attribute, has a value that must be a hex encoded MD4 hash of the user's password with a UTF-16 encoding. Fortunately generating such a string is a Python one-liner.

import hashlib

password = 'fred123'
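# hexdigest() returns the hash as a hex string on both Python 2 and Python 3;
# 'md4' is only available if the underlying OpenSSL build provides it.
nt_password = hashlib.new('md4', password.encode('utf-16le')).hexdigest().upper()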

Note that Samba wants all the alpha characters in the string as upper-case. The result will always be 32 characters long.

Discovering DLL Version With pefile

A Microsoft KB article claimed that a bug reported by one of my users would be resolved if a specific DLL was at least a certain version. But the user was using their computer and I dislike interrupting people's work (I know how annoying it is when someone interrupts me). No problem; I can just grab the named DLL off their machine over the network and copy it to my home directory. But I'm not running Windows, and all file tells me is that the DLL is a 32-bit PE file.

XSLT Transform to TXT, with LXML

Maybe this should be obvious, but it wasn't to me. I've got an XML document and an XSLT stylesheet. But that stylesheet just produces text, not XML; it is essentially a template for an e-mail. So I was extending OIE's transformAction to perform XSLT transforms that produce something other than XML... but the documentation is a bit thin and every example shows XML results. The trick is pretty simple, just

unicode(result)

and make sure [of course] that you have
