- From: Karl Dubost <karl@la-grange.net>
- Date: Thu, 20 Jun 2013 11:37:43 -0400
- To: www-archive Archive <www-archive@w3.org>
XHTML files on Mac OS X are not indexed the same way that HTML files are. But there is a solution.

# BEFORE XHTML aware.

Here is an example of an XHTML file and what Spotlight knows about it. The information is very basic: nothing related to the content of the file.

→ mdls bnf.xhtml
kMDItemContentCreationDate = 2011-10-01 11:47:27 +0000
kMDItemContentModificationDate = 2013-01-07 00:56:56 +0000
kMDItemContentType = "public.xhtml"
kMDItemContentTypeTree = (
    "public.xhtml",
    "public.xml",
    "public.text",
    "public.data",
    "public.item",
    "public.content"
)
kMDItemDateAdded = 2011-10-01 11:47:27 +0000
kMDItemDisplayName = "bnf.xhtml"
kMDItemFSContentChangeDate = 2013-01-07 00:56:56 +0000
kMDItemFSCreationDate = 2011-10-01 11:47:27 +0000
kMDItemFSCreatorCode = ""
kMDItemFSFinderFlags = 0
kMDItemFSHasCustomIcon = 0
kMDItemFSInvisible = 0
kMDItemFSIsExtensionHidden = 0
kMDItemFSIsStationery = 0
kMDItemFSLabel = 0
kMDItemFSName = "bnf.xhtml"
kMDItemFSNodeCount = 6447
kMDItemFSOwnerGroupID = 502
kMDItemFSOwnerUserID = 502
kMDItemFSSize = 6447
kMDItemFSTypeCode = ""
kMDItemKind = "HTML"
kMDItemLogicalSize = 6447
kMDItemPhysicalSize = 8192

# MODIFYING mdimporter.

* Go to /System/Library/Spotlight
* Find RichText.mdimporter
* Right-click on it and choose "Show Package Contents".
* Inside the folder, edit with your text editor (TextMate, Sublime Text, etc.)
  the Info.plist file, or run something like

    sudo subl /System/Library/Spotlight/RichText.mdimporter/Contents/Info.plist

* You will see something like:

    <array>
        <string>public.rtf</string>
        <string>public.html</string>
        <string>public.xml</string>
        <string>public.plain-text</string>
        <string>com.apple.traditional-mac-plain-text</string>
        <string>com.apple.rtfd</string>
        <string>com.apple.webarchive</string>
        <string>org.oasis-open.opendocument.text</string>
        <string>org.openxmlformats.wordprocessingml.document</string>
    </array>

* Edit it to add <string>public.xhtml</string>

    <array>
        <string>public.rtf</string>
        <string>public.html</string>
        <string>public.xhtml</string>
        <string>public.xml</string>
        <string>public.plain-text</string>
        <string>com.apple.traditional-mac-plain-text</string>
        <string>com.apple.rtfd</string>
        <string>com.apple.webarchive</string>
        <string>org.oasis-open.opendocument.text</string>
        <string>org.openxmlformats.wordprocessingml.document</string>
    </array>

* Save it

# REINDEXING

To reindex a file you can just use mdimport

→ mdimport bnf.xhtml

# LET'S look again at the data.
→ mdls bnf.xhtml
kMDItemContentCreationDate = 2011-10-01 11:47:27 +0000
kMDItemContentModificationDate = 2013-01-07 00:56:56 +0000
kMDItemContentType = "public.xhtml"
kMDItemContentTypeTree = (
    "public.xhtml",
    "public.xml",
    "public.text",
    "public.data",
    "public.item",
    "public.content"
)
kMDItemDateAdded = 2011-10-01 11:47:27 +0000
kMDItemDisplayName = "bnf.xhtml"
kMDItemFSContentChangeDate = 2013-01-07 00:56:56 +0000
kMDItemFSCreationDate = 2011-10-01 11:47:27 +0000
kMDItemFSCreatorCode = ""
kMDItemFSFinderFlags = 0
kMDItemFSHasCustomIcon = 0
kMDItemFSInvisible = 0
kMDItemFSIsExtensionHidden = 0
kMDItemFSIsStationery = 0
kMDItemFSLabel = 0
kMDItemFSName = "bnf.xhtml"
kMDItemFSNodeCount = 6447
kMDItemFSOwnerGroupID = 502
kMDItemFSOwnerUserID = 502
kMDItemFSSize = 6447
kMDItemFSTypeCode = ""
kMDItemKeywords = (
    livre,
    "bibliothe\U0300que",
    lutte,
    carnet
)
kMDItemKind = "HTML"
kMDItemLogicalSize = 6447
kMDItemPhysicalSize = 8192
kMDItemTitle = "Numérisation des livres de la BNF - Carnets de La Grange"

We can now see that the metadata includes the title and the keywords, so the file becomes searchable on its content.

# SEARCHING

The file is now accessible from the Spotlight box at the top right, but also from the command line. For example

→ mdfind "kMDItemTitle=='*livres de la BNF*'"
/long/path/to/file/bnf.xhtml

It worked!

ps: interesting note about kMDItemKeywords and encoding.

-- 
Karl Dubost
http://www.la-grange.net/karl/
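
The Info.plist edit from the MODIFYING step could also be scripted rather than done by hand. What follows is only a minimal sketch in Python, not the method used above. It assumes the <array> shown in the message sits under CFBundleDocumentTypes / LSItemContentTypes, which is the usual layout for an importer but is not shown in this message. The function name add_content_type is hypothetical, and the sketch works on an in-memory sample, not on the real system file.

```python
import plistlib


def add_content_type(plist_bytes, new_type="public.xhtml", after="public.html"):
    """Return plist bytes with new_type listed right after `after`.

    Assumption: the content types live in
    CFBundleDocumentTypes[*]["LSItemContentTypes"]; the original
    message only shows the inner <array>, not the surrounding keys.
    """
    info = plistlib.loads(plist_bytes)
    for doc_type in info.get("CFBundleDocumentTypes", []):
        uti_list = doc_type.get("LSItemContentTypes", [])
        # Only touch arrays that already handle HTML, and never add twice.
        if after in uti_list and new_type not in uti_list:
            uti_list.insert(uti_list.index(after) + 1, new_type)
    return plistlib.dumps(info)


# Small in-memory sample standing in for the importer's Info.plist.
sample = plistlib.dumps({
    "CFBundleDocumentTypes": [
        {
            "CFBundleTypeRole": "MDImporter",
            "LSItemContentTypes": ["public.rtf", "public.html", "public.xml"],
        }
    ]
})

updated = plistlib.loads(add_content_type(sample))
print(updated["CFBundleDocumentTypes"][0]["LSItemContentTypes"])
```

Running the function a second time on its own output changes nothing, so it is safe to reapply. After writing the result back to the importer, the file would still need to be reindexed with mdimport as described in the REINDEXING step.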
Received on Thursday, 20 June 2013 20:58:07 UTC