Tuesday, September 17, 2013

C/C++ Project Built with GNU Build System (A.K.A. GNU Autotools): NetBeans vs. Eclipse CDT

Recently, I had to work on a C++ project built with the GNU Build System and decided to use the NetBeans IDE. I already use NetBeans for many Java EE projects and since it claims support for the GNU Build System it was natural for me to import the project there.

I encountered some issues because the GNU Build System tools I was using were installed in my home directory (OS X 10.8 ships old components which I could not use). To try to solve this issue, I decided to try Eclipse CDT.

Neither tool solved that problem, but I had the opportunity to work side by side on the same project with two widely used IDEs I had never used for serious C++ programming before. After a couple of months I had a very clear idea of what each IDE offered me in this specific use case and made my final decision. This post is not a detailed comparison of all their features: first of all, because I believe that the "Eclipse vs. NetBeans" debate, even in the C/C++ realm, is more a matter of taste than a matter of functionality; secondly, because there is plenty of information on the internet. What I could not find, and the reason I am writing this blog post, is how these tools compare when the GNU Build System is part of the equation.

Both IDEs Claim Support for the GNU Build System

As stated in the introduction, both IDEs claim support for the GNU Build System and in fact I could build my project in less than 10 minutes in either IDE without previous knowledge. However, setting up my workspace with either tool was a very different experience.

Setting up the project in NetBeans was so easy that at first I thought it had ignored the build system altogether. NetBeans imported the sources, configured my OS toolchain, detected the presence of the autoconf and automake files, configured the project by running configure and built it straight away.

Setting up the project in Eclipse CDT was easy, but not as straightforward as with NetBeans, the main difference being that Eclipse CDT requires manual intervention (a click, basically) to set up the GNU Build System in the imported project. On the bright side, as we will see, Eclipse CDT is much more configurable than NetBeans and offers clean ways to configure how the build system is invoked.

Importing the Sources into a NetBeans Project

Importing the project sources into a NetBeans project is straightforward. First of all, uncompress your source tarball somewhere (assuming you created it with make dist). Then choose File/New Project and select C/C++ Project with Existing Sources in the New Project dialog, as shown in the next picture.

Create New C++ Project

In the next dialog, just choose the path of your sources and an appropriate tool collection (this is a C++11 project, so Clang was chosen on OS X). If you leave the configuration mode set to automatic, NetBeans will check for the existence of either a makefile or a configure script and will set up your project to configure and build in its root directory. If you prefer building in a separate directory, or need multiple build configurations, choose Custom and fine-tune your project.

Configure C++ Project

NetBeans will create a new project from your sources, configure it and build it:

Running configure

Running make

From now on, until you modify your configuration files, you can just build the project in NetBeans and make will be invoked on the generated makefile.

If you need to reconfigure the project, you can select Project/More Build Commands/Reconfigure Project. A configuration dialog will pop up where you can specify additional configure parameters:

Configure parameters

As you have seen, NetBeans offers a very easy way to import a project built with the GNU Build System with almost no user interaction. The basic functionality (configure and make) is there, hidden behind a very thin and intuitive UI layer.

Importing the Sources into an Eclipse CDT Project

Creating a working Eclipse CDT project from the same tarball is easy, but not as easy as it is with NetBeans. First of all, choose File/Import/Existing Code as Makefile Project:

Import Existing Code


In the next dialog, pretty much as in the NetBeans case, the only required user input is the source path.

Configure C++ Project

When Eclipse imports the project, it does not detect the presence of the GNU Build System configuration files. To configure the build system, the user must select File/New/Convert to a C/C++ Autotools Project (a very bad naming choice, because I would not expect to find this feature in the File/New submenu):

Convert C++ Project to Autotools Project

Once the project has been converted to an Autotools project, Eclipse will run configure and you can start working on it.

Running configure

Since only Clang can properly compile this C++11 project on OS X 10.8, I needed to reconfigure the project (basically, to have Eclipse add CXX=clang++ to the configure invocation).

Autotools configuration is where Eclipse CDT shines. Autotools settings can be found in the project settings:

Configuring the Build System

As you can see, this is more than what NetBeans offers. Common configure parameters are hierarchically organised in a tree where they can be set with handy UI controls:

Configuring the Compiler

Once the compiler has been set, the project builds correctly:

Running make

Comparison

I've been using both environments for a couple of months, switching between one and the other without experiencing any major issue, even while working on different platforms, alternating between OS X and Linux.

In my experience, both IDEs offer a solid working environment featuring a nice integration of the GNU Build System into the UI. On the usability side, I think NetBeans' UI is cleaner, more intuitive and easier to use. I think this is true in general, and it's one of the reasons I usually stick with NetBeans for Java SE and Java EE development.

On the other hand, Eclipse CDT offers much finer control over the build system. As we have seen, the build settings let you tweak many configure parameters from a handy UI, and they can be saved as configurations.

The Eclipse CDT project settings supposedly let you specify alternate paths for the GNU Build System tools:

Configuring the Build System Paths

This feature would be very handy, since it's not uncommon to install updated versions of the tools in an alternate location. Unfortunately, I was unable to get it working properly when the alternate versions are not on the user's path (which is precisely when I'd use this customisation in the first place). Even though the correct version is invoked by Eclipse, the "wrong" version (the one in the path) is picked up whenever a tool invokes another one (as in the case of autoreconf). Which seems reasonable, since the GNU Build System tools are shell scripts. In fact, when I first saw this dialog, I wondered how it could work in my case, with a local autoconf and automake installation in my home directory.

Considering the IDE as a whole, I still prefer NetBeans to Eclipse CDT. C/C++ support is well integrated and the UI is easier and more intuitive. Eclipse CDT is certainly more configurable and offers features which NetBeans lacks, such as support for configure.ac and Makefile.am files, which NetBeans treats as plain text. Despite these gaps, though, I believe NetBeans wins on the usability side.

Code Formatting

Another of my major concerns with NetBeans for C/C++ was the lack of proper, configurable code formatting tools. For a while I even tried using GNU indent to fill this gap, but the workflow was disruptive and error-prone, since indent's support for C++ is still experimental. Fortunately, this gap has been filled and you can now instruct NetBeans to format your C/C++ code using the most commonly used coding styles, such as:
  • Apache.
  • BSD (ANSI and OpenSolaris).
  • GNU.
  • K&R.
  • Linux Kernel.
  • MySQL.
  • NetBeans.
  • Whitesmiths.

NetBeans Code Formatter

The list of supported styles is in fact longer than what Eclipse CDT currently offers:
  • BSD/Allman.
  • GNU.
  • K&R.
  • Linux.
  • Whitesmiths.

Eclipse Code Formatter

On the other hand, if you need to create a customised style from scratch, Eclipse CDT is still superior as far as customisation options are concerned.

Both code formatters work pretty well and I found only a few quirks. Both IDEs offer good tools and I think that nowadays they're equally usable.

Tool Collection Management

NetBeans manages the tool collections of a platform in a clean way, separating the configuration of a tool collection from the configuration of a project. This way, you can define a new tool collection, or choose another among those available on your system, and use it to build a project as a simple drop-in replacement for the original. This is especially handy not only when you're testing the build with different toolchains, but also when you're building the same project on different platforms.

In the following pictures, you can see the definition of the two tool collections which are available by default on OS X 10.8: clang and gcc.

CLang Tool Collection

GNU Tool Collection

The project I've been working on had to build on both OS X and Linux. With NetBeans, I can just open the project on either platform and choose the appropriate tool collection. In fact, each time I switched from OS X to Linux, NetBeans detected an invalid tool (clang++ was not available) and offered to reconfigure the project.

In the following picture you can see how a project can be configured to use a tool collection. Once the tool collection is chosen, NetBeans will automatically reconfigure the project.

Project Configuration

When using Eclipse CDT, on the other hand, the C++ compiler is selected by setting (or removing altogether) the CXX variable. There's no automatic or easy way to reconfigure the project as NetBeans does, and while changing the compiler variable is technically easy, it's a usability problem. On this front, NetBeans clearly wins.

Conclusion

An IDE is a very important tool in the life of a programmer, and choosing one is a delicate process which may have a huge impact on your performance. On the one hand, I often try different IDEs to choose the best one for a specific use case. On the other hand, becoming proficient with an IDE is a process which requires time and, depending on the situation (project scheduling is a tyrant), I feel it's better to stick with a well-known IDE and be (very) productive from the beginning, rather than switch to a brand new one because it offers features the former lacks.

If you are already a NetBeans or Eclipse user, stick with it. Most of the time you'll be fine and will not need anything else. Maybe you'll find a solid reason to switch to another IDE, and when that time comes, the best thing is to try more than one and decide for yourself.

That's what I did. I'm a long-time NetBeans user and started the project with it. I found some minor issues and used those problems as an opportunity to investigate other IDEs, such as Eclipse CDT. While working in NetBeans is familiar and comfortable to me, I was very willing to check out the latest Eclipse CDT (Kepler at the time of writing) and see whether I should reconsider my IDE choice.

In the end I decided to stay with NetBeans because of its usability features, especially the tool collection management, even though Eclipse CDT offers much finer control over the build system configuration: being able to easily switch OS and/or tool collection, as NetBeans allows, was the deciding factor for me.

Thursday, January 10, 2013

Backups Using Amazon S3 and Glacier: A Clarification

Some days ago I published a post about the recent integration of Glacier as a new storage class in Amazon S3 and how it paves the way to new and interesting use cases, even for home users, despite being a service more geared towards enterprise users. The post was then kindly cited by Ted Forbes in the latest instalment (at the time of writing) of his excellent photography podcast, The Art of Photography: Episode 118, Photo Storage with Amazon Glacier and S3. The podcast has driven a great deal of traffic to my blog post, and I've received lots of emails with questions related to it.

Many of them asked whether I had considered, or why the post did not consider, other more user-friendly cloud backup solutions. In fact, most of these comments focused on a completely different kind of service, with a particular emphasis on services which enable easy and automatic backups of entire computers, drives, or folders.

Now, it was never my intention to go into the details of that kind of offering, and I won't do it now. But I do think a follow-up to the original post is necessary to clarify a couple of things.

First of all, I want to stress a fundamental assumption that I took for granted when choosing S3 and Glacier as a cold storage service for some of my files: I want to offload files from my disks, assuming I'm done working with them and almost certainly won't need to access them in the medium term (if not in the foreseeable future). Ted did a great job in his podcast episode explaining how Amazon S3 and Glacier can be used and suggesting some interesting use cases. He certainly did a better job than my original blog post at showing that Glacier is an interesting option for offloading big files we don't use often to a reliable and affordable cloud storage service.

In fact, in my current workflow there's no room (nor will) for strategies other than offloading from my workstations, and I suspect many users out there have similar workflows and issues (I guess photographers do). Some kinds of content are very "bulky": photographs and video footage can easily reach tens of gigabytes per work session, if not more, and even an amateur photographer like me can easily outgrow his hard disk, no matter how big it is. Of course, I've kept expanding my disk pools at home to satisfy the ever-increasing need for space, but I'm certainly not willing to keep unnecessary files on the internal hard disks of my machines beyond the time strictly necessary to work on them. Once I'm done with them, I either back them up in my home storage appliance (if I foresee the need to have them quickly available) or I offload them.

That's the use case Glacier is great for! I'm not asking for anything more, nor anything less, than an affordable and reliable place to store them until I need them, should that ever happen.

To make a long story short, I agree there are lots of alternatives out there, each with its own features, strengths, shortcomings and level of complexity. Google Drive, for example, is just great for keeping a relatively small amount of content organised and synchronised across a wide range of devices. CrashPlan's offerings for home users are a great way to start easily backing up entire computers and drives. Zoolz has a similar offering, with distinct online and cold storage tiers.

Nevertheless, what I really don't like about some of these services is that they sometimes charge depending on the number of users and/or computers you're backing up. I use many different devices and, because of my workflow, they're all pretty easy to set up and contain pretty much the same data: I keep locally only the applications I need and the data I'm working on. Everything else is kept off the internal hard drives. This approach is very convenient because I never worry about the loss of a machine: I just need to install the OS and the applications which, of course, I always keep available. As an OS X user I don't even use Time Machine, because it's quicker (much quicker) to just reinstall the OS and the apps I need, let alone synchronising tens of gigabytes over the internet. For me it's just nonsense: I need to work fast and to recover fast. But I recognise it's certainly appealing to lots of other users with different needs.

For that reason, in my workflow I really don't need nor want any client synchronising anything over the wire. I just load a bunch of data I'm working on onto my workstations (a photo session, for example), back it up locally elsewhere (as you should always do with assets you need and cannot lose) and, when I'm finished with it, I offload it somewhere else and delete it from my drives.

That somewhere is currently Amazon S3 and Amazon Glacier: it's affordable, it's easy to use and no matter how many devices I'm working on, I can always grab my data if I need it.

Sunday, December 30, 2012

Amazon S3 and Glacier: A Cheap Solution for Long Term Storage Needs

In the last few years, lots of cloud-based storage services began providing relatively cheap solutions to many classes of storage needs. Many of them, especially consumer-oriented ones such as Dropbox, Google Drive and Microsoft SkyDrive, try to appeal to their users with free tiers and collaborative and social features. Google Drive is a clear case of this trend, having "absorbed" many of the features of the well-known Google Docs applications and seamlessly integrated them into easy-to-use applications for many platforms, both mobile and desktop.

I've been using these services for a long time now and, despite being really happy with them, I've been looking for alternative solutions for other kinds of storage needs. As an amateur photographer, for example, I generate a lot of files, and my long-term backup storage need currently grows by tens of gigabytes per month. If I used Google Drive to satisfy those needs, supposing I'm already in the terabyte range, I'd pay almost $50 per month! Competitors don't offer seriously cheaper solutions either. At that price, one could argue that a decent home-based storage solution would be a better answer.

The Backup Problem

The problem is that many consumer cloud storage services are not really meant for backup, and you're paying for a service which keeps your files always online. Typical backup strategies, on the other hand, involve storing files on media which are kept offline, reducing the total cost of the solution. At home, you could store your files on DVDs and keep hard disk space available for other tasks. Instead of DVDs, you could use hard drives as well. We're not considering management issues here (DVDs and hard drives can fail over time, even if kept off and properly stored), but the important thing to grasp is that different storage needs can be satisfied by different storage classes, minimising the long-term cost of storing assets whose size is most probably only going to grow over time.

This kind of issue has been addressed by Amazon, which recently rolled out a new service for low-cost, long-term storage needs: Amazon Glacier.

What Glacier Is Not

As soon as Glacier was announced, there was a lot of talk about it. At a cost of $0.01 per gigabyte per month, it clearly seemed an affordable solution for this kind of problem. One terabyte would cost $10 per month: 5 times cheaper than Google Drive and 10 times cheaper than Dropbox (at the time of writing).

But Glacier is a different kind of beast. For starters, Glacier requires you to keep track of a Glacier-generated archive identifier every time you upload a new file. Basically, it acts like a gigantic database where you store your files and retrieve them by key. No fancy user interface, no typical file system hierarchies such as folders to organize your content.

Glacier's design philosophy is great for system integrators and enterprise applications using the Glacier API to meet their storage needs, but it certainly keeps the average user away from it.
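
To make the idea concrete, here's a minimal sketch of what a direct Glacier upload looks like with the AWS SDK for Python (boto3); the vault name and file are made up for illustration, and this assumes the SDK is installed and AWS credentials are configured. The point to notice is that the archive ID returned by the service is the only handle you get back:

import boto3

glacier = boto3.client("glacier")

# Upload a local file as a Glacier archive; boto3 computes the
# required tree-hash checksum automatically.
with open("photos-2012-09.tar", "rb") as f:  # hypothetical file
    response = glacier.upload_archive(
        vaultName="my-backups",              # hypothetical vault
        archiveDescription="Photos, September 2012",
        body=f,
    )

# Glacier hands back an opaque identifier: store it somewhere safe,
# because it's the only key you can later use to retrieve the archive.
archive_id = response["archiveId"]
print(archive_id)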

Glacier Can Be Used as a New Storage Class in S3

Even if Glacier was designed and rolled out with enterprise users in mind, at the time of release the Glacier documentation already stated that it would be seamlessly integrated with S3 in the near future.

S3 is a web service which pioneered cloud storage offerings, and it's as easy to use as any other consumer-oriented cloud storage service. In fact, if you're not willing to use the good S3 web interface, lots of S3 clients exist for almost every platform. Many of them even let you mount an S3 bucket as if it were a hard disk.

The downside of S3 for backup scenarios has always been its price, which is much higher than that of its competitors: 1 terabyte costs approximately $95 per month (for standard redundancy storage).

The great news is that now that Glacier has been integrated with S3, you can have the best of both worlds:
  • You can use S3 as your primary user interface to manage your storage. This means that you can keep on using your favourite S3 clients to manage the service.
  • You can configure S3 to transparently move content to Glacier using lifecycle policies.
  • You will pay Glacier's fees for content that's been moved to Glacier.
  • The integration is completely transparent and seamless: you won't need to perform any other kind of operation; your content will be transitioned to Glacier according to your rules and will always be visible in your S3 bucket.

The only important thing to keep in mind is that files hosted on Glacier are kept offline and can be downloaded only if you request a "restore" job. A restore job can take up to 5 hours to be executed, but that's certainly acceptable in a non-critical backup/restore scenario.

How To Configure S3 and Use the Glacier Storage Class

The Glacier storage class cannot be used directly when uploading files to S3. Instead, transitions to Glacier are managed by a bucket's lifecycle rules. If you select one of your S3 buckets, you can use the Lifecycle properties to configure seamless file transitions to Glacier:

S3 Bucket Lifecycle Properties

In the previous image you can see a lifecycle rule of one of my buckets, which moves content to Glacier according to the rules I defined. You can create as many rules as you need, and rules can contain both transitions and expirations. In this use case, we're interested in transitions:

S3 Lifecycle Rule - Transition to Glacier

As you can see in the previous image, the aforementioned S3 lifecycle rule instructs S3 to migrate all content in the images/ folder to Glacier after just 1 day (the minimum amount of time you can select). All files uploaded into the images directory will automatically be transitioned to Glacier by S3.
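
If you prefer scripting to the web console, the same rule can be expressed programmatically. Here's a sketch using the AWS SDK for Python (boto3), with a made-up bucket name:

import boto3

s3 = boto3.client("s3")

# Transition everything under images/ to the Glacier storage class
# one day after creation (the same rule shown in the screenshot above).
s3.put_bucket_lifecycle_configuration(
    Bucket="my-backup-bucket",               # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-images",
                "Filter": {"Prefix": "images/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 1, "StorageClass": "GLACIER"},
                ],
            },
        ],
    },
)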

As previously stated, the integration is transparent and you'll keep on seeing your content in your S3 bucket even after it's been transitioned to Glacier:

S3 Bucket Showing Glacier Content
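
The same transparency shows up in the object metadata: the object is still listed in the bucket, and only its storage class reveals that the bytes now live in Glacier. A quick boto3 sketch (same made-up names as above):

import boto3

s3 = boto3.client("s3")

# The key is still visible in the bucket listing; head_object
# reports the GLACIER storage class once the transition has run.
metadata = s3.head_object(
    Bucket="my-backup-bucket",
    Key="images/photos-2012-09.tar",
)
print(metadata.get("StorageClass"))          # "GLACIER"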

Requesting a Restore Job

The seamless integration between the two services doesn't end here. Glacier files are kept offline, and if you try to download one you'll get an error instructing you to initiate a restore job.

You can initiate a restore job from within the S3 user interface using a new Action menu item:

S3 Actions Menu - Initiate Restore

When you initiate a restore job for part of your content (of course, you can select only the files you need), you can specify the amount of time the content will be kept online before being automatically migrated back to Glacier:

S3 Initiating a Restore Job on Glacier Content

This is great since you won't need to remember to transition content back to Glacier: you simply ask S3 to bring your content online for the specified amount of time.
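
Again, for the scripting-inclined, here's what the same request looks like as a boto3 sketch (bucket and key are made up). While the job is in progress, a head_object call on the key reports the ongoing restore in its Restore field:

import boto3

s3 = boto3.client("s3")

# Ask S3 to bring a Glacier-archived object back online for 7 days;
# afterwards it automatically reverts to Glacier-only storage.
s3.restore_object(
    Bucket="my-backup-bucket",
    Key="images/photos-2012-09.tar",
    RestoreRequest={"Days": 7},
)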

Conclusions

This post quickly outlines the benefits of storing a backup copy of your important content on Amazon Glacier, taking advantage of the ease of use and affordable price of this service. Glacier integration in S3 enables any kind of user to take advantage of it without even changing their existing S3 workflow. And if you're new to S3, it's just as easy to use as any other cloud storage service out there. Maybe Amazon's applications are not as fancy as Google's, but their offer is unmatched today, and there are lots of easy-to-use S3 clients, either free or commercial (such as Cyberduck and Transmit if you're a Mac user), and even browser-based S3 clients such as plugins for Firefox and Google Chrome.

Everybody has files to back up, and many people are unfortunately unaware of the intrinsic fragility of typical home-based backup strategies, let alone the users who never perform any kind of backup. Hard disks fail, that's just a fact; you just don't know when it's going to happen. And besides hard disk failures, other problems may appear over time, such as undetected data corruption, which can only be addressed using dedicated storage technologies (such as the ZFS file system), all of which are usually out of reach for many users, either because of their cost or because of the skills required for setup and management.

In the last 6 years, I've been running a dedicated Solaris server for my storage needs, and I've bought at least 10 hard drives. When I projected the total cost of ownership of this solution, I realised how much money Glacier would allow me to save. And it did.

Of course I'm still keeping a local copy of everything, because I sometimes require quick access to it, but I've reduced the redundancy of my disk pools to the bare minimum and still sleep well at night, because I know that whatever happens my data is safe at Amazon's premises. If a disk breaks (it happened a few days ago), array reconstruction is not an issue any longer, and I just use two-way mirrors instead of more costly solutions. I could even give up mirrors altogether, but I'm not willing to reconstruct the content from Glacier every time a disk fails (and that's going to happen at least once every two or three years, according to my personal statistics).

So far I've never needed to restore anything from Glacier, but I'm sure that day will eventually come. And I want to be prepared. You should want to be as well.

P.S.: Ted Forbes has cited this blog post in Episode 118 (Photo Storage with Amazon Glacier and S3) of The Art of Photography, his excellent podcast about photography. If you don't know it yet, you should check it out. Ted is an amazing guy and his podcast is awesome, with content that ranges from tips and techniques to interesting digressions on the art of photography. I've learnt a lot from him and I bet you will, too.

Wednesday, November 21, 2012

Creating a Compliant PDF for a Blurb Book with TeX

I've been an avid Blurb user since I received my first photo book. They're just awesome. But I'm not going to talk about photos this time.

Today I received my first text book: 440 pages, trade format (6x9 inches), printed in black and white, with hardcover and dust jacket. I opened it and it's just awesome. As awesome as a custom-made text book can be, at least at that price (approximately $33). The book is heavy, feels sturdy, doesn't look cheap, the text is finely printed and the paper is very good, with a nice creamy colour and a pleasant feel.

So far I've only tried Blurb's text and photo books, and they really live up to expectations. If you're wondering whether you should try them: you definitely should. Furthermore, Blurb has recently expanded its offerings to include magazines, brochures, planners and other products. I haven't tried them yet (and don't know if I will), but should I need one of them, I wouldn't hesitate to purchase it from Blurb.

The Downside of Using TeX with Blurb

TeX (and TeX-derived processors) are wonderful, provided you know how to use them. And if you have a scientific academic background, chances are you do. I've been a faithful TeX user since the 90s, and I've never switched to other editors, not even for pure text books. TeX's advantages are manifold, but suffice it to say that the quality of the documents you can produce is high, much higher (from a typesetting standpoint) than what you can achieve with basic desktop publishing software (such as Microsoft Word). And TeX is really portable: the same source produces the same output everywhere.

Blurb provides many tools to easily create books and upload them to its site. Most of them take care of producing a Blurb-compliant book, so users don't have to worry about that. Adobe Lightroom 4, for example, has a very flexible Book module that mimics Blurb's BookSmart. On Blurb's site you can find the updated list of available tools for editors.

What's the problem with TeX? The problem is that the output of a TeX engine depends on the engine itself. For "generic" books, Blurb has a PDF To Book feature that allows users to upload a Blurb-compliant PDF for print. And what is "Blurb compliance"? Blurb obviously checks a lot of things about the PDF in order to make sure the book is printed as the client wants it to be. Basically, the pages in the PDF file must conform to the sizes specified by Blurb's PDF To Book specification, which in turn depends on your book geometry. For example, the number of pages and the paper type determine the height of a book's spine, and you've got to take that into account when designing your book cover. The same reasoning applies to all the parameters of a book's geometry (such as page size, cropped size, bleed, margins and safe boundaries).

To complicate things even further, Blurb requires the PDF to be a compliant PDF/X-3 file. Needless to say, many PDF export tools (such as OS X's save to PDF or pdfTeX) don't guarantee such compliance.

When it comes to TeX, then, apart from setting up your page geometry according to Blurb's specifications (a problem which affects every publishing tool you may use), you're required to produce a compliant PDF/X-3 file, otherwise Blurb's preflight checks will fail.

The purpose of this blog post is to describe how I solved this problem and to help TeX/LaTeX users out there easily produce a Blurb-compliant PDF file.

My workflow usually is:
  • Typeset the book in TeX with approximate book specifications: you don't really know them until the exact number of pages is known, so don't bother nailing them down at the beginning.
  • Once the number of pages is known, fine-tune the book specs.
  • Once the book is ready for print, produce a compliant PDF/X-3 file.

Book Specifications

This is the easy part, and it's one thing TeX is really great at. On the other hand, since TeX is so flexible, how you achieve it depends on factors such as the TeX engine and the packages you're using.

When editing text books, I usually rely on LaTeX and the memoir document class. Why? Because it encapsulates common book-editing functionality provided by lots of other packages (without you having to deal with each of them), it's extremely well documented and it's easy to use, especially when it comes to tuning the stock and page sizes. If you use other engines or classes, however, don't worry: rely on their documentation to discover how to tune the page size according to Blurb's specs.

Furthermore, as we'll see later, pdfTeX (and company) doesn't produce PDF/X-3 files, and you've got to rely on some tricks to produce a proper one. You may want to check other TeX engines (such as the excellent ConTeXt), which now support PDF/X out of the box: getting a Blurb-compliant PDF with ConTeXt is a no-brainer. But you might lose the possibility of using other packages you're used to (such as the memoir class, which I really like). If you're just beginning your book, I suggest you take a look at ConTeXt and see whether it fits your needs.

Here's an example of a book setup in case you decide to use LaTeX and the memoir class. Let's suppose your book has 390 pages, a hardcover with dust jacket, and black and white printing. According to Blurb, the book specifications are (in points):
  • Final PDF should measure: 441x666
  • Page size / trim line: 432x648
  • Bleed: 9
  • Inset for margins (top, bottom, outer): 18
  • Inset for margins (binding): 45 

This is a screenshot of the Blurb calculator:

Blurb Calculator - Specifications for a book

To tune your memoir document, you can use this code (I prefer to use inches in the book):

\setstocksize{9.25in}{6.125in}
\settrimmedsize{9in}{6in}{*}
\settrims{0.125in}{0.0625in}
\settypeblocksize{*}{\lxvchars}{1.618}
\setlrmargins{*}{*}{0.618}
\setulmargins{*}{*}{1}
\setbinding{0.625in}
\checkandfixthelayout

Please note that the \settypeblocksize command is not required to get a compliant book; it's just a suggestion for a beautifully proportioned type block.

Change the dimensions accordingly (remember that 1 inch corresponds to 72 points) to match your book dimensions as given by the Blurb calculator. If you need more insight into the features provided by the memoir class, check its documentation.
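
As a sanity check, the inch measures in the snippet above map exactly onto the point values reported by the Blurb calculator (the 441x666 final PDF size and the 45-point binding inset):

\[
6.125\,\mathrm{in} \times 72 = 441\,\mathrm{pt} \qquad
9.25\,\mathrm{in} \times 72 = 666\,\mathrm{pt} \qquad
0.625\,\mathrm{in} \times 72 = 45\,\mathrm{pt}
\]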

Producing a PDF/X-3 Compliant File

This is the tricky part, and I was lucky enough to find a proven solution on the internet (in a discussion thread on TeX - StackExchange). In fact, I worried about this only when the book was already typeset, and I started wondering whether I should switch to ConTeXt (losing part of the typesetting work) or purchase Adobe Distiller. Fortunately, there was a solution.

To have pdfTeX produce a compliant file, you should add this fragment just before the \begin{document} statement and tune it to your needs:

\pdfinfo{
/Title (Your Book Title)    % set your title here
/Author (The Author)        % set author name
/Subject (The Subject)      % set subject
/Keywords (The Keyword Set) % set keywords
/Trapped (False)
/GTS_PDFXVersion (PDF/X-3:2002)
}
% I think Blurb ignores both the MediaBox and the TrimBox, but I put it anyway
\pdfpageattr{
/MediaBox [0 0 441.00000 666.00000]
/TrimBox [0.00000 9.00000 432.00000 657.00000]}
\pdfminorversion=3
\pdfcatalog{
/OutputIntents [ <<
/Info (none)
/Type /OutputIntent
/S /GTS_PDFX
/OutputConditionIdentifier (Blurb.com)
/RegistryName (http://www.color.org/)
>> ]
}

The MediaBox and the TrimBox are the important parts, since they establish the page geometry. As you can see, the MediaBox is set to Blurb's final PDF size (beware that the box coordinates are swapped compared to the output of the Blurb calculator).

The TrimBox is a little trickier for "casual users". Since the bleed is 9, I set the first corner of the trim box to (0pt, 9pt) (that's why you'd better get point measures from Blurb) and the second corner to (432pt, 657pt). Why? 432pt is the trimmed width, and you can leave it as is. Since the bleed affects the bottom and the top of the page, you subtract it from the media box height: 666 - 9 = 657. This assumes, of course, that your text doesn't run too close to the trim lines.

Run your input file through pdfTeX and you should get a compliant PDF/X-3 file. Compliance can be checked with the free Adobe Reader application. First of all, you should see the yellow trim box you specified on your PDF pages and, most importantly, you should find the following two entries in the document properties (both required by PDF/X-3):
  • PDF Version: 1.3 (Acrobat 4.x).
  • GTS_PDFXVersion: PDF/X-3:2002

The former is found in the main document properties tab (called Description), the latter in the Custom tab, as shown in the following picture:

Adobe Reader - Document Properties - Custom

What About the Cover?

The same reasoning applies. However, even if I consider myself a hardcore TeX user, I just rely on Adobe InDesign and Blurb's InDesign plugin to get a good cover. You could certainly argue that you can design it with TeX, and that's true, but when I'm dealing with images on a single-page cover I just prefer InDesign, especially since Adobe lets us rent it for approximately $30 a month. I try to accumulate some jobs and process them all in a single month.

Sometimes that's not possible, and I do realise the cost is a deal breaker for many. Anyway, as I already said, there are many other options; just check Blurb's website.

Conclusion

As you can see, producing a TeX book for printing with Blurb is easy enough for any TeX user. The results are wonderful, both because TeX is a great engine and because Blurb delivers awesome quality products at reasonable prices.

I hope this helps a lot of TeX users out there.

Monday, September 3, 2012

Architexa Review: Understand and Document Java Code Bases within Eclipse

The people at Architexa have just released their Architexa tool for free to individuals and to teams of up to 3 members. In their own words, Architexa is a tool "to understand and document Java code bases within Eclipse". At first it seems yet another tool for generating UML diagrams from an existing code base (and it certainly is one), but it has other interesting features that differentiate it from its competitors. In fact, such tools have been around for a long time and some of them have even enjoyed good adoption rates (think of Rational Rose).

Architexa, however, seems to focus on a niche: doing things fast and collaboratively, and I must acknowledge they're pretty close to achieving that goal. In fact, Architexa is not a fully fledged UML diagramming tool such as Rational Rose was, or other plugins still are. Rather, it offers interesting features and possibilities using UML as the main UI. In my opinion, that's an important point to take into account when reviewing it.

The only catch is that Architexa is an Eclipse plugin. Sure, there are lots of Eclipse users out there and giving priority to Eclipse is a sound choice on their part. However, there are lots of another-IDE kinds of guys out there (I'm mainly a NetBeans kind of guy), and developers working in corporations may not have the freedom to install their IDE of choice.

Installing Architexa

Installing Architexa is as straightforward as installing any Eclipse plugin. Just add the software repository URL you get during signup to Eclipse and install the plugin. The installation procedure is simple and almost unattended: the only question you'll be asked is whether to trust a certificate. Once the plugin is installed, just restart your Eclipse instance and you're ready to go.

Using Architexa

Architexa's user interface is pretty straightforward:
  • A couple of menu items (both in the main menu and in contextual menus) to open one of the available diagrams.
  • Some menu items to manage Architexa indexes.
  • Some menu items to access Architexa documentation right inside your Eclipse IDE.

Three types of diagrams are currently available:
  • Layered Diagram
  • Class Diagram
  • Sequence Diagram

Depending on the context in which you invoke a particular item, Architexa will take you to the chosen element in the corresponding diagram.

Since I suspect this kind of tool is most useful to developers working on big projects whose code base they don't completely know, I decided to review it by importing a subset of an EJB module of an application I worked on into an empty Eclipse project and starting from there.

Layered Diagram

The Layered Diagram looks like a package diagram at first, but there's much more to it than that. It's a very useful representation of your project architecture which, like every Architexa diagram, can easily be configured to contain just the level of detail you need. This way, you can get a quick overview of the module dependencies in your project and progressively "drill down" to discover more detailed relationships between packages, between package contents (classes and interfaces) and other elements.

Here's what a layered diagram looks like:

Layered Diagram


As you can see, the diagram is divided into layers, and when you hover over a component with your mouse its dependencies are shown as arrows of different sizes. By default, objects in an upper layer depend on objects in a lower layer.

In the previous diagram the interceptors package was collapsed; you can expand it just by double-clicking on it. Furthermore, in the following picture you can see the dependencies of an element (AuditableEntityListener) revealed just by hovering the mouse pointer over it:

Layered Diagram - Discover the Dependencies of an Item

The Architexa UI is great because it relies on intuitive concepts to convey information to the user. The size of each element in this diagram is proportional to the "quantity" of information it holds in the project. This way, you can get important information at a quick glance, such as:
  • Which packages and classes are big.
  • Where a great number of dependencies are concentrated.
  • Whether your architecture contains cycles.

By default, the level of detail of the diagram is minimal, but the user can expand it "on demand". In the following pictures you can see how you can add additional information about an element just by hovering over it and using the dependency arrows to add levels to the type hierarchy:

Layered Diagram - Add Levels to the Type Hierarchy

Layered Diagram - Add Levels to the Type Hierarchy

This diagram offers a small palette you can use to add insightful detail to your diagram, in order to use it as a good documentation tool. The palette currently includes elements such as actors, databases and user comments.

Class Diagram

At first sight, a class diagram behaves more or less like one generated by any other UML diagramming tool you may have tried. However, it follows the Architexa design philosophy we saw in the previous section: reduce the clutter and get the job done. When you generate a class diagram from a class, you're just presented with something like this:

Class Diagram

No doubt that's just the bare minimum you need to know about a class. Once again, you can have Architexa add the information you're interested in through its user interface. First of all, at the class level, you can add referencing types and methods. Then you can add other class information (such as methods, interfaces, etc.) using the menu that appears when hovering over the class and selecting the items you want to show:

Class Diagram - Select Class Information

As you can see, you can filter methods by visibility and items by type (interfaces, classes, methods and fields). If you want to add them all, just use the Add All button:

Class Diagram - Method Information

You can further refine the information of a class item using its contextual menu, shown in the next picture:

Class Diagram - Class Item Contextual Menu

As you can see, you can add a wealth of information, such as called and calling methods, referenced and referencing methods, and the declaring class.

Class Diagram - Types Referenced by a Method

Finally, there's also a quick way to add called-method information. When you hover over a method, an arrow control is shown:

Class Diagram - Add Called Methods

and if you click it you'll be presented with a dialog where you can choose the called methods you want to add:

Class Diagram - Called Methods Selection Dialog

The dialog can be used to filter methods by visibility, and there are two buttons that allow you to add either all called methods or the whole callee hierarchy. The resulting diagram looks like this:

Class Diagram - Called Methods

Obviously, this process can be repeated on any element added to the diagram until you've added all the information you want to show.

Sequence Diagram

The sequence diagram provided by Architexa is very similar to the diagrams generated by similar tools. Once again, however, the Architexa philosophy is to reduce the clutter and let the user decide what should be shown.

An initial sequence diagram looks like this:

Sequence Diagram

Hovering over the class lets you choose the members to be shown in the diagram, using the method selection window we've already described in previous sections:

Sequence Diagram - Method Selection Window

Once a method is chosen, the diagram is updated:

Sequence Diagram - Selected Method

Hovering over other diagram elements lets you add depth to the call hierarchy, selecting more and more levels to be shown. Adding the methods called by the persist method results in the following diagram:

Sequence Diagram - Methods Called by the persist Method

Collaborative Features


Architexa provides basic collaborative features and lets you share diagrams with other people. The Architexa main menu contains the following items:

Collaborative Features

As you can see, you can get diagrams from a server or share your own. When you share a diagram, you're given two choices: sharing using a server (which acts as a central repository) or sharing by email (which simply attaches the diagram to a newly created email message):

Sharing a Diagram

Another way you can share a diagram is presented when you save a newly created diagram:

Saving a Diagram

Architexa lets you save a diagram as:
  • A file on the local disk.
  • A shared diagram on a private server.
  • A shared diagram on a community server.

Conclusions

Architexa is not a typical UML diagramming tool, in that it's built with a different design philosophy. Instead of "just" producing diagrams out of an existing code base, it lets the user customise the diagrams and decide which details must be included, in an easy and intuitive way. This fulfils the promise of Architexa's slogan: it's a great tool for creating diagrams that "make sense", according to each user's needs.

If you haven't tried it, this may seem just a nuance, but it's a great usability leap in the right direction for such a tool. I've been a user of UML modelling tools for years, and I grew more and more dubious about the alleged improvements in developers' productivity. Most of the time, if not always, I ended up relying on my textual IDE to navigate through the code base, jumping from method to method as needed. This time, however, I feel that Architexa fills a gap and can really be useful to a developer, not only in the documentation stage, at least in a handful of use cases. The Architexa UI is very efficient and pretty intuitive, and during my tests I felt very "proficient" at jumping from one dependency to another or from one method call to another.

But all that glitters is not gold, and Architexa has its own shortcomings. First of all: it's an Eclipse plugin and it's not available for other IDEs. This is a deal-breaker for many users, like me, who are not willing to switch IDEs.

Then, some important Java language features are missing, such as generics. Generics aren't the new kids on the block (two major Java releases have seen the light since support for generics was added to the language), and they can't be dismissed as something of little importance either. I don't know why no information about generic types and signatures appears on Architexa diagrams, but I hope this gap will be filled soon.

Then, I'd really like to see "awareness" of more Java technologies built into the tool. When I started reviewing it, I decided to use a fragment of an EJB module to see if there were more bells and whistles than what I was reading in other reviews, which used simple Java projects. Given Architexa's design philosophy, I'd really like to see more information about classes, at least in the Layered Diagram. Furthermore, since many recent Java EE technologies, such as EJB and JPA, rely heavily on runtime-available annotations, I added entities and EJBs to the project to see whether annotations were discoverable information: unfortunately, they were not.

Architexa is a good tool and I think it's on the right track to catch on with developers and raise adoption from the bottom up. Developers can take good advantage of it, and I believe that's a critical aspect for such a tool to gain adopters. Instead of being a tool imposed on them for methodology's sake, it's a tool that adds real value and helps them get the job done more easily and more effectively. Furthermore, Architexa is now free for individuals and small teams (up to 3 members), so everyone can sign up, download it and start using it on their own real-life projects.