NSXMLParser memory use

written by mfazekas on May 24th, 2009 @ 10:33 AM

Cocoa on Mac OS X provides two XML parsers: NSXMLDocument for tree-based parsing and NSXMLParser for event-driven (or streaming) parsing.

The iPhone has only NSXMLParser. One would expect this (event-based) parser to have a minimal memory footprint, so I was very upset when I ran into out-of-memory issues parsing an 8 MB XML document. This article describes the issues with NSXMLParser and shows some workarounds/alternatives.

NSXMLParser in theory

A quote from Apple’s documentation on NSXMLParser:

Event-driven parsing—because it deals with only one XML construct at a time and not all of them at once—consumes much less memory than tree-based parsing. It is ideal for situations where performance is a goal and modification of the parsed XML is not.

This is basically DOM vs. SAX. Streaming/SAX parsers can (and should) work with minimal memory, as they don’t have to keep the whole XML document in memory.
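To make the memory argument concrete, here is a toy sketch in C. It is not a real XML parser and says nothing about how NSXMLParser or libxml2 work internally; it only shows the shape of event-driven processing: input arrives in chunks, a callback fires per start tag, and the only state kept between chunks is one flag. All names (tag_scanner, scanner_feed, scan_stream, count_tag) are made up for this illustration, and comments/CDATA are deliberately not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Called once per start tag; `user` is caller-supplied state. */
typedef void (*start_tag_cb)(void *user);

struct tag_scanner {
    int          after_lt;  /* 1 if the previous byte was '<' */
    start_tag_cb on_start;  /* event callback (hypothetical API) */
    void        *user;
};

/* Feed one chunk. State carries over between calls, so a tag may
   straddle a chunk boundary. Toy rule: '<' followed by anything
   other than '/', '?' or '!' counts as a start tag. */
void scanner_feed(struct tag_scanner *s, const char *buf, size_t len)
{
    assert(s != NULL && s->on_start != NULL);
    for (size_t i = 0; i < len; i++) {
        char c = buf[i];
        if (s->after_lt && c != '/' && c != '?' && c != '!')
            s->on_start(s->user);
        s->after_lt = (c == '<');
    }
}

/* Stream a file through the scanner with a fixed 4 KB buffer:
   memory use does not depend on the size of the document. */
void scan_stream(FILE *in, struct tag_scanner *s)
{
    char buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
        scanner_feed(s, buf, n);
}

/* Example callback: count start tags in a size_t. */
void count_tag(void *user)
{
    ++*(size_t *)user;
}
```

A tree-based parser would instead have to keep a node for every element until the end, which is where the difference in footprint comes from.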

NSXMLParser in practice

But let’s take a look at what’s going on with NSXMLParser in a real-world application. The following ObjectAlloc output was captured with a program downloading an XML document and then parsing it with NSXMLParser initWithData:.

The XML was 8 MB. As you can see, NSXMLParser initWithData: allocated an extra 22 MB to parse that 8 MB document. On the iPhone 22 MB is pretty significant. But it only got worse during parsing: NSXMLParser kept allocating memory until the program reached 54 MB total usage and was killed by the OS before the parsing finished. My estimate is that for this 8 MB document NSXMLParser used at least 37 MB of extra memory.

AQXMLParser to the rescue

AQXMLParser is an NSXMLParser replacement. Just by replacing the word NSXMLParser with AQXMLParser in the code, the memory use got much better: it could finish parsing the document with a peak memory use of 19 MB. The extra memory used by AQXMLParser was less than 1 MB.

Summary: issues with NSXMLParser

  1. it is not a real streaming parser: initWithURL: will download the full XML before processing it. For memory use this is bad, as it has to allocate memory for the full XML, which can’t be reclaimed until the end of the parse. For performance it’s also bad, as you cannot interleave the IO-intensive part (downloading) with the CPU-intensive part (parsing).
  2. it provides no forms of input other than NSData and NSURL. For example, if you send your authentication information in the NSURLRequest body, then your only option is to download the data yourself and use initWithData:.
  3. it will not release memory. It seems that strings/dictionaries created during parsing are kept around until the end of the parse. I’ve tried to improve this with creative use of NSAutoreleasePool, but without any success.
  4. it is slow. Apple’s XMLPerformance sample shows that NSXMLParser is up to 2x slower than parsing with libxml2. This is mostly the overhead of creating Objective-C strings and dictionaries.

Workarounds/alternatives

AQXMLParser is a drop-in replacement that solves all of these except the performance overhead of the Objective-C interface:

  1. it is a streaming parser: it starts parsing as soon as the first chunk arrives and does not wait for the full document to download.
  2. input can be NSInputStream and NSURLRequest in addition to NSURL and NSData.
  3. it will release the strings/dictionaries it creates as soon as possible.

If you need maximum performance, you can use libxml2 directly, as demonstrated by Apple in XMLPerformance. But that means you have to rewrite your parsing code.

Getting AQXMLParser

To use it, add AQXMLParser.h and AQXMLParser.m to your project. Add libxml2 to your headers/libraries, and add the CFNetwork framework to your project. See the readme for details.

Tracing Objective-C messages

written by mfazekas on May 3rd, 2009 @ 10:22 PM

Tracing is a great debugging tool. If you are working on a not so well documented part of the system, looking inside it can be a big help. As Objective-C is very dynamic, it has quite good tracing capabilities. It provides out-of-the-box tools to dump the messages sent. In this article I’ll show how you can log all the messages sent inside the system, and I’ll also show how you can log the messages sent to a particular object.

Tracing of all messages with NSObjCMessageLoggingEnabled

The definitive source on debugging on Mac OS X is TN2124, Mac OS X Debugging Magic. Of course it mentions the tracing facility of the Objective-C runtime:

 
If you set the NSObjCMessageLoggingEnabled environment variable to "YES", 
the Objective-C runtime will log all dispatched Objective-C messages to a 
file named /tmp/msgSends-<pid>.

The only issue with this kind of tracing is that it produces so much information that it is hard to analyze.

Turning tracing on/off – instrumentObjcMessageSends

Fortunately there is an undocumented function: instrumentObjcMessageSends. It lets you turn tracing on/off programmatically, for only parts of the program.

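As it’s an undeclared function, you have to declare it yourself before you can call it:

#import <Foundation/Foundation.h>
// Undocumented runtime function; it is not declared in any public header.
FOUNDATION_EXPORT void instrumentObjcMessageSends(BOOL enable);

// Turn message logging on or off around the code you want to trace.
- (void)enableTrace:(BOOL)enable
{
  instrumentObjcMessageSends(enable);
}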

Filtering by class or methods

You can even go further and override the logging callback used by this facility. This can be useful for limiting the log to messages for certain classes, or for redirecting the output elsewhere.

Tracing messages sent to a particular object

However, if you want to log messages sent to a specific instance, you can use another technique: Objective-C message forwarding.

The proxy class is meant to be used by creating a proxy with initWithOriginal: and then using the created instance in place of the original. Note that its use is not 100% transparent and it breaks in many situations, but I still find it a useful debugging tool.

The world’s smallest Mach-O executable!

written by mfazekas on April 4th, 2009 @ 06:44 PM

Amit Singh, the author of the excellent Mac OS X Internals: A Systems Approach, posted an article about creating a minimal Mach-O (OS X) executable: Crafting a Tiny Mach-O Executable. In the article he demonstrated a 165-byte Mach-O executable returning the exit code 42. As he noted, there are some zeros in his program, so we should be able to compress it further. This article is my attempt to reduce the executable size a bit.

Strategy

This is the 165-byte executable from Amit’s article:

; tiny.asm for Mac OS X (Mach-O Object File Format)
; nasm -f bin -o tiny tiny.asm

BITS 32
  org   0x1000

  db    0xce, 0xfa, 0xed, 0xfe       ; magic
  dd    7                            ; cputype (CPU_TYPE_X86)
  dd    3                            ; cpusubtype (CPU_SUBTYPE_I386_ALL)
  dd    2                            ; filetype (MH_EXECUTE)
  dd    2                            ; ncmds
  dd    _start - _cmds               ; sizeofcmds
  dd    0                            ; flags
_cmds:
  dd    1                            ; cmd (LC_SEGMENT)
  dd    44                           ; cmdsize
  db    "__TEXT"                     ; segname
  db    0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ; segname
  dd    0x1000                       ; vmaddr
  dd    0x1000                       ; vmsize
  dd    0                            ; fileoff
  dd    filesize                     ; filesize
  dd    7                            ; maxprot

  dd    5                            ; cmd (LC_UNIXTHREAD)
  dd    80                           ; cmdsize
  dd    1                            ; flavor (i386_THREAD_STATE)
  dd    16                           ; count (i386_THREAD_STATE_COUNT)
  dd    0, 0, 0, 0, 0, 0, 0, 0       ; state
  dd    0, 0, _start, 0, 0, 0, 0, 0  ; state
_start:
  xor   eax,eax
  inc   eax
  push  byte 42
  sub   esp, 4
  int   0x80                         ; _exit(42)

filesize equ  $ - $$

So we have a mach_header followed by 2 load commands, then the executable code at _start. The total size is 165 bytes: the header with the 2 load commands takes 152 bytes (28 + 44 + 80) and the code takes only 13 bytes. If we can embed the code into the load commands (there are a lot of zeroes), then we’ll be able to save these 13 bytes.
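The 152 + 13 split can be double-checked with a few lines of C. This is only a sketch: the structs mirror the listing field by field (including the shortened 44-byte LC_SEGMENT as written in tiny.asm), and their names are made up here rather than taken from <mach-o/loader.h>. The per-instruction byte counts assume the long, unoptimized encodings, where sub esp, 4 takes 6 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field-for-field mirror of the listing (illustrative names). */
struct tiny_mach_header {           /* 7 x dd */
    uint32_t magic, cputype, cpusubtype, filetype;
    uint32_t ncmds, sizeofcmds, flags;
};

struct tiny_segment_cmd {           /* LC_SEGMENT as written in tiny.asm */
    uint32_t cmd, cmdsize;
    char     segname[16];
    uint32_t vmaddr, vmsize, fileoff, filesize, maxprot;
};

struct tiny_thread_cmd {            /* LC_UNIXTHREAD + i386 thread state */
    uint32_t cmd, cmdsize, flavor, count;
    uint32_t state[16];
};

/* xor eax,eax (2) + inc eax (1) + push byte 42 (2)
   + sub esp,4 with a 32-bit immediate (6) + int 0x80 (2) */
enum { CODE_BYTES = 2 + 1 + 2 + 6 + 2 };

/* Bytes taken by the header plus both load commands. */
size_t tiny_header_bytes(void)
{
    return sizeof(struct tiny_mach_header)
         + sizeof(struct tiny_segment_cmd)
         + sizeof(struct tiny_thread_cmd);
}

/* Total file size: headers followed by the code at _start. */
size_t tiny_file_bytes(void)
{
    return tiny_header_bytes() + CODE_BYTES;
}

/* Asserts the numbers quoted in the text. */
void tiny_check(void)
{
    assert(sizeof(struct tiny_mach_header) == 28);
    assert(sizeof(struct tiny_segment_cmd) == 44);
    assert(sizeof(struct tiny_thread_cmd)  == 80);
    assert(tiny_header_bytes() == 152);
    assert(tiny_file_bytes()   == 165);
}
```

Calling tiny_check() from a main() passes silently when the arithmetic holds.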

Test rig

I’m using the following shell script as a test framework. It compiles, executes and tests the executable:

#!/bin/sh
nasm -f bin -o tiny tiny.asm
ls -las  ./tiny
./tiny
if [ $? -eq 42 ] ; then
  echo "OK" 
else
  echo "Fail" 
fi

The compressed code:

So the plan is to place the 13 bytes of code into one of the unused series of zeroes in the load commands. There are 2 series of zeroes we can reuse. One is in LC_SEGMENT after __TEXT. We need a single zero for a correctly null-terminated C string, but the others can be reused: this is 9 bytes of space. We have a longer series of zeroes in LC_UNIXTHREAD. These zeroes represent the initial state of the registers, and it’s safe to change them to non-zero values. We have 40 + 20 bytes of free space here (before and after the eip slot that holds _start).

The simple solution would be to place the 13 bytes of code into the 40-byte or 20-byte gap. But that would be too simple. Instead I’ll place the first 4 bytes of code plus a 5-byte jmp into LC_SEGMENT, after the null-terminated string __TEXT, and I’ll put the remaining 9 bytes of code into the zeroes of the thread state.
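Whether the two pieces actually fit can be checked with a bit of arithmetic, sketched here in C. The constants are read off the listings (16-byte segname, 16 dwords of thread state, _start in the 11th dword, i.e. index 10); the function names are made up for this sketch, and the instruction sizes again assume the long encodings (5-byte jmp, 6-byte sub).

```c
#include <assert.h>

enum {
    SEGNAME_BYTES = 16,         /* fixed-size segname field in LC_SEGMENT */
    NAME_BYTES    = 6,          /* "__TEXT" */
    NUL_BYTES     = 1,          /* terminator we have to keep */
    STATE_DWORDS  = 16,         /* i386_THREAD_STATE_COUNT */
    EIP_INDEX     = 10,         /* slot holding _start in the state array */
    PART1_BYTES   = 2 + 2 + 5,  /* xor eax,eax; push byte 42; jmp rel32 */
    PART2_BYTES   = 1 + 6 + 2   /* inc eax; sub esp,4 (imm32); int 0x80 */
};

/* Free bytes in segname after "__TEXT" and its terminator. */
int segname_free_bytes(void)
{
    return SEGNAME_BYTES - NAME_BYTES - NUL_BYTES;
}

/* Free bytes in the thread state before and after the eip slot. */
int state_free_before_eip(void) { return EIP_INDEX * 4; }
int state_free_after_eip(void)  { return (STATE_DWORDS - EIP_INDEX - 1) * 4; }

/* Zero bytes needed after part 2 to get back to a dword boundary. */
int part2_padding(void)
{
    return (4 - PART2_BYTES % 4) % 4;
}

/* Asserts that both code pieces fit where the text puts them. */
void check_layout(void)
{
    assert(PART1_BYTES == segname_free_bytes());               /* exact fit */
    assert(PART2_BYTES + part2_padding() <= state_free_before_eip());
}
```

The first part is an exact fit (9 bytes into 9), which is why the jmp has to be the 5-byte form: a shorter encoding would leave the segname field too short and shift everything after it.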

So here is the code for the 152-byte executable:


; tiny.asm for Mac OS X (Mach-O Object File Format)
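; nasm -f bin -o tiny tiny.asm
; (byte counts below assume unoptimized encodings; newer NASM versions
;  optimize by default and may need -O0 to keep this layout)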

BITS 32
    org   0x1000

    db    0xce, 0xfa, 0xed, 0xfe       ; magic
    dd    7                            ; cputype (CPU_TYPE_X86)
    dd    3                            ; cpusubtype (CPU_SUBTYPE_I386_ALL)
    dd    2                            ; filetype (MH_EXECUTE)
    dd    2                            ; ncmds
    dd    _cmdend - _cmds              ; sizeofcmds
    dd    0                            ; flags
_cmds:
    dd    1                            ; cmd (LC_SEGMENT)
    dd    44                           ; cmdsize
    db    "__TEXT"                     ; segname
    db    0                            ; segname

_start0:                               ; first part of the code
    xor   eax,eax                      ; 2 bytes
    push  byte 42                      ; 2 bytes
    jmp   _start2                      ; 5 bytes

    dd    0x1000                       ; vmaddr
    dd    0x1000                       ; vmsize
    dd    0                            ; fileoff
    dd    filesize                     ; filesize
    dd    7                            ; maxprot

    dd    5                            ; cmd (LC_UNIXTHREAD)
    dd    80                           ; cmdsize
    dd    1                            ; flavor (i386_THREAD_STATE)
    dd    16                           ; count (i386_THREAD_STATE_COUNT)

_start2:                               ; second part
    inc   eax                          ; 1 byte
    sub   esp, 4                       ; 6 bytes
    int   0x80                         ; 2 bytes, _exit(42)

    db    0, 0, 0                      ; pad the code to a dword boundary
    dd    0, 0, 0, 0, 0                 ; state 
    dd    0, 0, _start0, 0, 0, 0, 0, 0  ; state
_cmdend:

filesize equ $ - $$

Is it possible to compress the size further?

An even smaller Mach-O executable would require smaller load commands. But just reducing the size and cmdsize of LC_UNIXTHREAD caused “Malformed Mach-o file”, so the trivial way doesn’t work.

Leave a comment if you have an idea for further space optimizations!

Hiding the "Take Picture" overlay

written by mfazekas on March 31st, 2009 @ 02:12 PM

One of the annoying things about the default UIImagePickerController is the “Take Picture” overlay. This article is about removing that useless piece of UI.

Rant: The iPhone screen is small, and I don’t really understand the role of that huge “Take Picture” caption. It’s hard to imagine any case where the user needs this information.

Finding the UIView to hide

Obviously that element is a UIView, and once we find it, it should be easy to hide. But how do we find it? One way is to dump the view hierarchy with dimensions and then locate the view in question based on its size/position.

The following code can be used to dump the view hierarchy with the dimensions:

- (void)dumpViews:(NSString*)pre view:(UIView*) view
{
    NSLog(@"%@View:%@ [%.f,%.f,%.f,%.f]\n",pre,view,
        view.frame.origin.x,view.frame.origin.y,
        view.frame.size.width,view.frame.size.height);
    for(UIView *sview in view.subviews) {
        [self dumpViews:[NSString stringWithFormat:@"  %@",pre] 
               view:sview];
    }
}

We’ll call it like [self dumpViews:@"" view:rootView];. We only need to find out the rootView, and the right time to call it. We need to call it after the elements are created but before they are displayed.

Finding a container view

Once the image picker view is visible, we should be able to use the window as the rootView, but again, we want to remove the overlay before it’s displayed. We can try using UIImagePickerController’s view property, but that will be empty just after creation. So we need to find the right point: after the creation, but before the display.

The solution is to realize that UIImagePickerController is a UINavigationController. A navigation controller notifies its delegate before it displays a new view controller, via navigationController:willShowViewController:animated:.

So when we implement the delegate method like this:

- (void)navigationController:(UINavigationController *)nav 
  willShowViewController:(UIViewController *)viewController 
  animated:(BOOL)animated
{
    [self dumpViews:@"" view:viewController.view];
}

We get the following output:

navigationController:<UIImagePickerController>
willShowViewController:<PLCameraViewController>
 View:<UIView: 0x5296a0> [0,0,320,480]
   View:<PLCameraView: 0x52ec40> [0,0,320,480]
     View:<UIImageView: 0x532380> [10000,10000,320,480]
     View:<UIView: 0x5372e0> [0,480,320,0]
     View:<PLCropOverlay: 0x529ce0> [0,0,320,480]
       View:<UIImageView: 0x528540> [0,20,320,96]
       View:<PLCropLCDLayer: 0x5285d0> [0,20,320,96]
       View:<TPBottomDualButtonBar: 0x535290> [0,384,320,96]
         View:<TPPushButton: 0x535450> [22,26,128,47]
         View:<TPCameraPushButton: 0x537900> [170,26,128,47]
           View:<UIImageView: 0x5372b0> [51,12,26,19]

From the dimensions it’s easy to see that what we want to hide are the UIImageView and the PLCropLCDLayer:

       View:<UIImageView: 0x528540> [0,20,320,96]
       View:<PLCropLCDLayer: 0x5285d0> [0,20,320,96]

Hiding the overlay

The challenge with hiding the “Take Picture” overlay is that we want to make sure we’re only hiding the right elements. We don’t want to get into a situation where we’re hiding the buttons instead of the overlay.

The following code makes sure that we’re hiding the correct elements by checking their dimensions:


- (void)hideTakePhotoCaption:(UIViewController*)viewController
{
    if (viewController.view.subviews.count > 0) {
        UIView* cameraView = [viewController.view.subviews 
                                            objectAtIndex:0];
        if (cameraView.subviews.count > 2) {
            UIView* cropOverlay = [cameraView.subviews lastObject];
            if (cropOverlay.subviews.count > 1) {

                UIView* imageView = [cropOverlay.subviews 
                                    objectAtIndex:0];
                UIView* view2 = [cropOverlay.subviews objectAtIndex:1];

                if (imageView.frame.size.height == 96.0 && 
                    view2.frame.size.height == 96.0) {
                    imageView.hidden = YES;
                    view2.hidden = YES;
                }
            }
        }
    }
}
- (void)navigationController:(UINavigationController *)nav 
  willShowViewController:(UIViewController *)viewController 
  animated:(BOOL)animated
{
    [self hideTakePhotoCaption:viewController];
}

We could add even more checks by making sure the items we hide have the right type (class), but that is left as an exercise for the reader.

Adding your UI elements to the camera screen

Note that the technique described above is not limited to hiding the overlay. You can add your own overlay as well.

Hello world!

written by mfazekas on February 8th, 2009 @ 08:33 PM

Hello world! Technorati Profile