Saturday, August 27, 2016

Applying JDK 9 @Deprecated Enhancements

I discussed the currently proposed JDK 9 enhancements for the @Deprecated annotation in the blog post JDK 9 @Deprecated Annotation Enhancements. In this post, I look in greater detail at the recommended usage of these minor enhancements and demonstrate how key Java SE APIs are already having these @Deprecated enhancements applied.

The current version of the main JEP 277 ("Enhanced Deprecation") web page states, "The primary purpose of enhancing the @Deprecated annotation is to provide finer-grained information to tools about the deprecation status of an API." The page also describes the two new methods being added to the @Deprecated annotation [forRemoval() and since()]:

  • "A method forRemoval() returning a boolean. If true, it means that this API element is earmarked for removal in a future release. If false, the API element is deprecated, but there is currently no intention to remove it in a future release. The default value of this element is false. ... The forRemoval() boolean element, if true, indicates intent that the API element is to be removed in some future release of the project. Users of the API are thus given advance warning that, if they don't migrate away from the API, their code is liable to break when upgrading to a newer release. If forRemoval() is false, this indicates a recommendation to migrate away from the deprecated API, but without any specific intent to remove that API."
  • "A method named since() returning String. This string should contain the release or version number at which this API became deprecated. It has free-form syntax, but the release numbering should follow the same scheme as the @since Javadoc tag for the project containing the deprecated API. ... The default value of this element is the empty string."

This text makes it clear that the intention is to be able to explicitly state whether a deprecated element is planned to be removed or whether there are no plans to remove it. This information tells clients of the deprecated element how urgently they need to switch to a different element.
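To make the "information for tools" idea concrete, here is a minimal sketch of a tool-like check that lists only the methods deprecated with forRemoval=true. The RemovalScanner and LegacyApi classes and their version strings are hypothetical names invented for this illustration (they are not part of JEP 277 or the JDK), and the sketch assumes JDK 9 or later.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class RemovalScanner
{
   /** Hypothetical API used only for this illustration. */
   public static class LegacyApi
   {
      /** Deprecated, but with no stated intent to remove. */
      @Deprecated(since = "2.1")
      public void discouraged() { }

      /** Deprecated and earmarked for removal. */
      @Deprecated(since = "1.4", forRemoval = true)
      public void doomed() { }

      /** Not deprecated at all. */
      public void current() { }
   }

   /**
    * Returns, in sorted order, descriptions of the methods declared on the
    * given class that are deprecated with forRemoval=true.
    */
   public static List<String> methodsSlatedForRemoval(final Class<?> type)
   {
      final List<String> names = new ArrayList<>();
      for (final Method method : type.getDeclaredMethods())
      {
         // @Deprecated has runtime retention, so it is visible reflectively.
         final Deprecated deprecated = method.getAnnotation(Deprecated.class);
         if (deprecated != null && deprecated.forRemoval())
         {
            names.add(method.getName() + " (deprecated since " + deprecated.since() + ")");
         }
      }
      Collections.sort(names);
      return names;
   }

   public static void main(final String[] arguments)
   {
      // prints [doomed (deprecated since 1.4)]
      System.out.println(methodsSlatedForRemoval(LegacyApi.class));
   }
}
```

Only doomed() is reported because it is the only method that sets forRemoval to true; discouraged() is deprecated but signals no removal intent.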

The application of new JDK 9 @Deprecated methods on the Java SE API can also be instructive in how they are intended to be used. About this, the JEP 277 page currently states (my emphasis added), "Several Java SE APIs will have a @Deprecated annotation added, updated, or removed. Some proposed changes are listed below. Unless otherwise specified, the deprecations listed here are not for removal. Note that this is not a comprehensive list of deprecations in Java SE 9. Also note that several of these items will not be implemented in Java SE 9." With this overview in mind, I now turn attention to examples from the current JDK 9 API documentation to illustrate these concepts.

Not Yet Deprecated

JEP 277's web page currently lists "add @Deprecated to the Optional.get method (JDK-8160606)" as one of the Java SE APIs for which "proposed changes" apply. Because deprecating Optional.get() is currently associated with a bug (JDK-8160606), it has not been ruled out for JDK 9 even though the current Javadoc documentation doesn't show it applied yet. The next two screen snapshots demonstrate that Optional.get() is not yet deprecated in Java SE 9.

Java SE 8: Optional.get() Introduced
Java SE 9: Optional.get() Not Yet Deprecated

Deprecated With No Plans for Removal

JEP 277 includes deprecation of constructors of "boxed primitives" in its list of Java SE APIs with "proposed changes" in @Deprecated handling. The next two screen snapshots demonstrate that the JDK 9 version of Boolean does have new @Deprecated annotations applied to its constructors.

Java SE 8: Boolean Constructors Not Deprecated
Java SE 9: Boolean Constructors Deprecated since=9

It's worth noting that the newly applied @Deprecated annotations include one of the new methods (since="9"), but not the other (no specification of forRemoval()). In this case, the user of Boolean should assume, unless otherwise stated, that the Boolean constructors have been deprecated since Java SE 9, but that there are no current plans to remove these deprecated constructors.
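The same information shown in the Javadoc can also be read at runtime. The following is a minimal sketch, assuming JDK 9 or later, that reflectively reads the @Deprecated annotation on the Boolean(boolean) constructor. The class name BoxedConstructorDeprecation is my own invention for this illustration. I deliberately show no expected output because the reported forRemoval value may differ between JDK releases.

```java
import java.lang.reflect.Constructor;

public class BoxedConstructorDeprecation
{
   /**
    * Returns the @Deprecated annotation found on Boolean(boolean),
    * or null if the running JDK does not annotate that constructor.
    */
   public static Deprecated booleanConstructorDeprecation()
   {
      try
      {
         // Looking the constructor up reflectively avoids calling the deprecated API.
         final Constructor<Boolean> constructor = Boolean.class.getConstructor(boolean.class);
         return constructor.getAnnotation(Deprecated.class);
      }
      catch (NoSuchMethodException noSuchConstructor)
      {
         throw new IllegalStateException("Boolean(boolean) is not available on this JDK.", noSuchConstructor);
      }
   }

   public static void main(final String[] arguments)
   {
      final Deprecated deprecated = booleanConstructorDeprecation();
      if (deprecated == null)
      {
         System.out.println("Boolean(boolean) is not deprecated on this JDK.");
      }
      else
      {
         System.out.println("since=" + deprecated.since() + ", forRemoval=" + deprecated.forRemoval());
      }
   }
}
```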

The Applet classes are treated similarly to the "boxed primitive" constructors in terms of JDK 9 @Deprecated annotation. Like the constructors of the boxed primitives classes, key applet-related classes are being newly deprecated in JDK 9, have since="9" included in the annotation to make it clear that they were deprecated as of Java SE 9, and don't have forRemoval() specified (meaning that false is assumed). Applet deprecation is covered by JEP 289 ("Deprecate the Applet API"), which does state, "Add the @Deprecated(since="9") annotation" to selected applet-related classes.

Applet Class @Deprecated Since JDK 9 But With No forRemoval()

Deprecated and Planned for Removal

One example of something deprecated in JDK 9 and marked for removal is System.runFinalizersOnExit(boolean). The following screen snapshots indicate that this method was already deprecated in Java SE 8, but that its deprecation in Java SE 9 also communicates that there is intent to remove this method. I also like that it communicates that this method was deprecated clear back in Java 1.2.

Java SE 8: Deprecated Method with No Hint at Removal Plans or Version Originally Deprecated
Java SE 9: Deprecated Method Communicating Original Deprecation Version and Removal Intent

Conclusion

JEP 277 is a highly readable treatise on the current deficiencies of @Deprecated and how the minor enhancements of JDK 9 can mitigate at least a portion of those deficiencies. Although in many ways the JDK 9 changes to @Deprecated might be described as "baby steps," they do provide a little more standardized facility for communicating a particular deprecation's history and future plans than what was available before JDK 9. The section of JEP 277 called "Usage in Java SE" is interesting in its own right because it describes several Java SE APIs (only a subset of which were highlighted in this post) that have been proposed to have their deprecation status changed or have additional details about the deprecation's history and/or future plans added to it.

Wednesday, August 17, 2016

JDK 9 @Deprecated Annotation Enhancements

In the post What Might a New @Deprecated Look Like?, I used the description of JEP 277 ("Enhanced Deprecation") at that time to guide the creation of an enhanced customized @Deprecated annotation. Since that post, however, there have been significant changes made in JEP 277. This post summarizes the changes and the currently planned enhancements to @Deprecated that are slated for JDK 9.

The changes made to JDK-8065614 ("JEP 277: Enhanced Deprecation") on 2016-03-03 18:04 remove the portion of the JEP description that described the proposed @Deprecated enum. The "Alternatives" section of the main JEP 277 page documents why the enum was removed:

Previous versions of this proposal included a variety of "reason" codes including UNSPECIFIED, DANGEROUS, OBSOLETE, SUPERSEDED, UNIMPLEMENTED, and EXPERIMENTAL. These attempted to encode the reason for which an API was deprecated, the risks of using it, and also whether a replacement API is available. In practice, all of this information is too subjective be encoded as values in an annotation. Instead, this information should be described in the Javadoc documentation comment.

The revised @Deprecated annotation now supports two methods as shown in the API documentation. The documentation explains that the forRemoval() method "indicates whether the annotated element is subject to removal in a future version" and returns false by default. The since() method documentation states that this second method "returns the version in which the annotated element became deprecated" and returns the empty string by default.

The defaults of false and "" for forRemoval() and since() respectively make sense because these defaults correspond to not being able to specify this information today with @Deprecated. Because there are countless uses of @Deprecated already in code bases, it makes most sense to have these existing uses of @Deprecated correspond to not having a planned removal and to not having "since" specified. Developers will be able to add these values to existing @Deprecated annotations as they desire or not at all.
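The defaults are easy to confirm with a small sketch, assuming JDK 9 or later. The DeprecatedDefaults class and its oldStyle() method are hypothetical names used only here. They stand in for the countless existing uses of @Deprecated that specify no elements.

```java
public class DeprecatedDefaults
{
   /** Hypothetical method deprecated in the pre-JDK 9 style, with no elements specified. */
   @Deprecated
   public static void oldStyle() { }

   /** Returns the @Deprecated annotation on the named no-argument method of this class. */
   public static Deprecated annotationOn(final String methodName)
   {
      try
      {
         return DeprecatedDefaults.class.getMethod(methodName).getAnnotation(Deprecated.class);
      }
      catch (NoSuchMethodException noSuchMethod)
      {
         throw new IllegalArgumentException("No such method: " + methodName, noSuchMethod);
      }
   }

   public static void main(final String[] arguments)
   {
      final Deprecated deprecated = annotationOn("oldStyle");
      // An unadorned @Deprecated reports the defaults: empty "since" and no removal intent.
      System.out.println("since is empty: " + deprecated.since().isEmpty());  // prints "since is empty: true"
      System.out.println("forRemoval: " + deprecated.forRemoval());           // prints "forRemoval: false"
   }
}
```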

These are minor additions to the @Deprecated annotation, but the new @Deprecated is still much better than what we have today in earlier versions of Java because we will now be able to specify two very important characteristics of a deprecation within the annotation itself. Specifying when a construct was deprecated and whether we plan to remove it altogether provides potentially insightful historical and future-looking information related to the deprecation.

Thursday, August 11, 2016

SPOOLing Queries with Results in psql

SQL*Plus, the Oracle database's command-line tool, provides the SPOOL command to "store query results in a file." The next screen snapshot shows SPOOL used in SQL*Plus to spool the listing of user tables to a file called C:\pdf\output.txt.

Both the executed query and the results of the query have been spooled to the file output.txt as shown in the next listing of that file.

Oracle's SQL*Plus's SPOOL-ed output.txt

SQL> select table_name from user_tables;

TABLE_NAME                                                                      
------------------------------                                                  
REGIONS                                                                         
LOCATIONS                                                                       
DEPARTMENTS                                                                     
JOBS                                                                            
EMPLOYEES                                                                       
JOB_HISTORY                                                                     
PEOPLE                                                                          
NUMERAL                                                                         
NUMBER_EXAMPLE                                                                  
COUNTRIES                                                                       

10 rows selected.

SQL> spool off

PostgreSQL's command-line tool, psql, provides functionality similar to SQL*Plus's SPOOL with the \o (\out) meta-command. The following screen snapshot shows this in action in psql.

The file output.txt written via psql's \o meta-command is shown in the next listing.

         List of relations
 Schema |  Name  | Type  |  Owner   
--------+--------+-------+----------
 public | albums | table | postgres
(1 row)

Only the results of the query run in psql are contained in the generated output.txt file. The query itself, even the longer query produced by using \set ECHO_HIDDEN on, is not contained in the output.

One approach to ensuring that the query itself is output with the query's results written to the file is to use the \qecho meta-command to explicitly write the query to the spooled file before running the query. This is demonstrated in the next screen snapshot.

Using \qecho in conjunction with \o does place the query itself in the written file with the query's results as shown in the next listed output.

select * from albums;
           title           |     artist      | year 
---------------------------+-----------------+------
 Back in Black             | AC/DC           | 1980
 Slippery When Wet         | Bon Jovi        | 1986
 Third Stage               | Boston          | 1986
 Hysteria                  | Def Leppard     | 1987
 Some Great Reward         | Depeche Mode    | 1984
 Violator                  | Depeche Mode    | 1990
 Brothers in Arms          | Dire Straits    | 1985
 Rio                       | Duran Duran     | 1982
 Hotel California          | Eagles          | 1976
 Rumours                   | Fleetwood Mac   | 1977
 Kick                      | INXS            | 1987
 Appetite for Destruction  | Guns N' Roses   | 1987
 Thriller                  | Michael Jackson | 1982
 Welcome to the Real World | Mr. Mister      | 1985
 Never Mind                | Nirvana         | 1991
 Please                    | Pet Shop Boys   | 1986
 The Dark Side of the Moon | Pink Floyd      | 1973
 Look Sharp!               | Roxette         | 1988
 Songs from the Big Chair  | Tears for Fears | 1985
 Synchronicity             | The Police      | 1983
 Into the Gap              | Thompson Twins  | 1984
 The Joshua Tree           | U2              | 1987
 1984                      | Van Halen       | 1984
(23 rows)

The main downside to use of \qecho is that it must be used before every statement to be written to the output file.

The psql variable ECHO can be set to queries to have "all SQL commands sent to the server [sent] to standard output as well." This is demonstrated in the next screen snapshot.

Unfortunately, although setting the psql variable ECHO to queries leads to the query being output along with the results in the psql window, the query is not written to the file by the \o meta-command. Instead, when \o is used with ECHO set to queries, the query itself is printed out again to the window and the results only are written to the specified file. This is because, as the documentation states (I added the emphasis), the \o meta-command writes "the query output ... to the standard output." This is demonstrated in the next screen snapshot.

I have not been able to figure out a way to easily use the \o meta-command and have both the query and its results written to the file without needing to use \qecho. However, another approach that doesn't require \qecho is to not try to spool the output from within psql interactively, but to instead execute a SQL script input file externally.

For example, if I make an input file called input.txt that consists only of a single line with the query

  select * from albums;

I could run psql with the command

  psql -U postgres --echo-queries < input.txt > outputWithQuery.txt

to read that single-line file with the query and write output to the outputWithQuery.txt file. The --echo-queries option works like the \set ECHO queries from within psql and running this command successfully generates the prescribed output file with query and results. The following screen snapshot and the code listing following that demonstrate this.

outputWithQuery.txt

select * from albums;
           title           |     artist      | year 
---------------------------+-----------------+------
 Back in Black             | AC/DC           | 1980
 Slippery When Wet         | Bon Jovi        | 1986
 Third Stage               | Boston          | 1986
 Hysteria                  | Def Leppard     | 1987
 Some Great Reward         | Depeche Mode    | 1984
 Violator                  | Depeche Mode    | 1990
 Brothers in Arms          | Dire Straits    | 1985
 Rio                       | Duran Duran     | 1982
 Hotel California          | Eagles          | 1976
 Rumours                   | Fleetwood Mac   | 1977
 Kick                      | INXS            | 1987
 Appetite for Destruction  | Guns N' Roses   | 1987
 Thriller                  | Michael Jackson | 1982
 Welcome to the Real World | Mr. Mister      | 1985
 Never Mind                | Nirvana         | 1991
 Please                    | Pet Shop Boys   | 1986
 The Dark Side of the Moon | Pink Floyd      | 1973
 Look Sharp!               | Roxette         | 1988
 Songs from the Big Chair  | Tears for Fears | 1985
 Synchronicity             | The Police      | 1983
 Into the Gap              | Thompson Twins  | 1984
 The Joshua Tree           | U2              | 1987
 1984                      | Van Halen       | 1984
(23 rows)

I don't know how to exactly imitate in psql SQL*Plus's writing of the query with its results without needing to add \qecho meta-commands, but passing the input script to psql with the --echo-queries option works very similarly to invoking and spooling the script from within SQL*Plus.

Tuesday, August 9, 2016

Remembering to Reset Thread Context Class Loader

I'm having a difficult time thinking of anything I like less about working with Java than working with class loaders. This is particularly true when working with application servers or OSGi where the use of multiple class loaders is prevalent and the ability to use class loaders transparently is reduced. I agree with the OSGi Alliance Blog post What You Should Know about Class Loaders that "in a modular environment, class loader code wreaks havoc."

Neil Bartlett has written the blog post The Dreaded Thread Context Class Loader in which he describes why the thread context class loader was introduced and why its use is not "OSGi-friendly." Bartlett states that there are rare cases in which "a library only consults the TCCL," but that in those rare cases "we are somewhat stuck" and "will have to explicitly set the TCCL from our own code before calling into the library."

Alex Miller has also written about the Thread Context Class Loader (TCCL) and points out that "Java frameworks do not follow consistent patterns for classloading" and that "many common and important frameworks DO use the thread context classloader (JMX, JAXP, JNDI, etc)." He emphasizes this: "If you are using a J2EE application server, you are almost certainly relying on code using the thread context classloader." In that post, Miller presents a dynamic proxy-based solution for helping in cases where one needs to "set the thread context classloader" and then "remember the original context classloader and re-set it."

The Knopflerfish Framework, an OSGi implementation, describes how to use the Thread Context Class Loader in the "Programming" section of its documentation. The following quote is excerpted from the "Setting the context classloader" section of Knopflerfish 5.2's "Programming" documentation:

Many external libraries, like most JNDI lookup services requires a correctly set thread context classloader. If this is not set, ClassNotFoundException, or similar might be thrown even if you have included all necessary libs. To fix this, simple spawn a new thread in the activator and do the work from that thread. ... It is not recommended to set the context class loader persistently on the startup thread, since that thread might not be unique for your bundle. Effects might vary depending on OSGi vendor. If you don't spawn a new thread, you must reset the context class loader before returning.

Knopflerfish provides a simple class, org.knopflerfish.util.ClassLoaderUtil, that supports switching to a provided class loader (which would probably often be the thread context class loader in an OSGi application) and ensures via a finally clause that the original context class loader is reset after the operation is completed. I have implemented my own adaptation of that class that is shown in the next code listing.

ClassLoaderSwitcher.java

package dustin.examples.classloader;

/**
 * Utility class for running operations on an explicitly specified class loader.
 */
public class ClassLoaderSwitcher
{
   /**
    * Execute the specified action on the provided class loader.
    *
    * @param classLoaderToSwitchTo Class loader from which the
    *    provided action should be executed.
    * @param actionToPerformOnProvidedClassLoader Action to be
    *    performed on the provided class loader.
    * @param <T> Type of Object returned by specified action method.
    * @return Object returned by the specified action method.
    */
   public static <T> T executeActionOnSpecifiedClassLoader(
      final ClassLoader classLoaderToSwitchTo,
      final ExecutableAction<T> actionToPerformOnProvidedClassLoader)
   {
      final ClassLoader originalClassLoader = Thread.currentThread().getContextClassLoader();
      try
      {
         Thread.currentThread().setContextClassLoader(classLoaderToSwitchTo);
         return actionToPerformOnProvidedClassLoader.run();
      }
      finally
      {
         Thread.currentThread().setContextClassLoader(originalClassLoader);
      }
   }

   /**
    * Execute the specified action on the provided class loader.
    *
    * @param classLoaderToSwitchTo Class loader from which the
    *    provided action should be executed.
    * @param actionToPerformOnProvidedClassLoader Action to be
    *    performed on the provided class loader.
    * @param <T> Type of Object returned by specified action method.
    * @return Object returned by the specified action method.
    * @throws Exception Exception that might be thrown by the
    *    specified action.
    */
   public static <T> T executeActionOnSpecifiedClassLoader(
      final ClassLoader classLoaderToSwitchTo,
      final ExecutableExceptionableAction<T> actionToPerformOnProvidedClassLoader) throws Exception
   {
      final ClassLoader originalClassLoader = Thread.currentThread().getContextClassLoader();
      try
      {
         Thread.currentThread().setContextClassLoader(classLoaderToSwitchTo);
         return actionToPerformOnProvidedClassLoader.run();
      }
      finally
      {
         Thread.currentThread().setContextClassLoader(originalClassLoader);
      }
   }
}

The two methods defined on the ClassLoaderSwitcher class each take an interface as one of their parameters along with a specified class loader. The interfaces prescribe an object with a run() method and that run() method will be executed against the provided class loader. The next two code listings show the interfaces ExecutableAction and ExecutableExceptionableAction.

ExecutableAction.java

package dustin.examples.classloader;

/**
 * Encapsulates action to be executed.
 */
public interface ExecutableAction<T>
{
   /**
    * Execute the operation.
    *
    * @return Optional value returned by this operation;
    *    implementations should document what, if anything,
    *    is returned by implementations of this method.
    */
   T run();
}

ExecutableExceptionableAction.java

package dustin.examples.classloader;

/**
 * Describes action to be executed that is declared
 * to throw a checked exception.
 */
public interface ExecutableExceptionableAction<T>
{
   /**
    * Execute the operation.
    *
    * @return Optional value returned by this operation;
    *    implementations should document what, if anything,
    *    is returned by implementations of this method.
    * @throws Exception that might be possibly thrown by this
    *    operation.
    */
   T run() throws Exception;
}

Clients calling the methods defined on the ClassLoaderSwitcher class won't necessarily have fewer lines of code than they'd have doing the temporary context class loader switching themselves. However, using a common class such as this one ensures that the context class loader is always changed back to the original class loader. The developer no longer needs to remember the reset, and the reset cannot be inadvertently removed or moved too late in the process during later changes.

A client that needs to temporarily change the context class loader for an operation might do so as shown next:

Temporarily Switching ClassLoader Directly to Execute Action

final ClassLoader originalClassLoader = Thread.currentThread().getContextClassLoader();
try
{
   Thread.currentThread().setContextClassLoader(BundleActivator.class.getClassLoader());
   final String returnedClassLoaderString =
      String.valueOf(Thread.currentThread().getContextClassLoader());
}
finally
{
   Thread.currentThread().setContextClassLoader(originalClassLoader);
}

There are not that many lines of code, but one has to remember to reset the context class loader to its original class loader. Using the ClassLoaderSwitcher utility class to do the same thing is demonstrated next.

Using ClassLoaderSwitcher to Switch Class Loader to Execute Action (pre-JDK 8)

final String returnedClassLoaderString = ClassLoaderSwitcher.executeActionOnSpecifiedClassLoader(
   BundleActivator.class.getClassLoader(),
   new ExecutableAction<String>()
   {
      @Override
      public String run()
      {
         return String.valueOf(Thread.currentThread().getContextClassLoader());
      }
   });

This last example wasn't shorter than the first, but the developer did not need to worry about resetting the context class loader explicitly in the second example. Note that these two examples reference BundleActivator to get an Activator/System class loader in an OSGi application. This is what I used here, but any class that was loaded on the appropriate class loader could be used here instead of BundleActivator. Another thing to note is that my examples use a very simple operation executed on the specified classloader (returning a String representation of the current thread context class loader) that works well here because it makes it easy for me to see that the specified class loader was used. In realistic scenarios, this method could be anything one needed to run on the specified class loader.

If the method I'm invoking on the specified class loader throws a checked exception, I can use the other overloaded method (of the same name) provided by ClassLoaderSwitcher to run that method. This is demonstrated in the next code listing.

Use of ClassLoaderSwitcher with Method that Might Throw Checked Exception (pre-JDK 8)

String returnedClassLoaderString = null;
try
{
   returnedClassLoaderString = ClassLoaderSwitcher.executeActionOnSpecifiedClassLoader(
      BundleActivator.class.getClassLoader(),
      new ExecutableExceptionableAction<String>()
      {
         @Override
         public String run() throws Exception
         {
            return mightThrowException();
         }
      });
}
catch (Exception exception)
{
   System.out.println("Exception thrown while trying to run action.");
}

With JDK 8, we can make the client code more concise. The next two code listings contain methods corresponding to the methods shown in the previous two code listings, but changed to JDK 8 style.

Using ClassLoaderSwitcher to Switch Class Loader to Execute Action (JDK 8 Style)

final String returnedClassLoaderString = ClassLoaderSwitcher.executeActionOnSpecifiedClassLoader(
   urlClassLoader,
   (ExecutableAction<String>) () ->
   {
      return String.valueOf(Thread.currentThread().getContextClassLoader());
   });

Use of ClassLoaderSwitcher with Method that Might Throw Checked Exception (JDK 8 Style)

String returnedClassLoaderString = null;
try
{
   returnedClassLoaderString = ClassLoaderSwitcher.executeActionOnSpecifiedClassLoader(
      urlClassLoader,
      (ExecutableExceptionableAction<String>) () -> {
         return mightThrowException();
      });
}
catch (Exception exception)
{
   System.out.println("Exception thrown while trying to run action.");
}

The lambda expressions of JDK 8 make the client code using ClassLoaderSwitcher more concise (and arguably more readable) than directly setting and resetting the context class loader and at the same time provide greater safety by ensuring that the context class loader is always switched back to its original class loader.
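The JDK 8 style listings above reference a urlClassLoader variable without showing where it comes from. The following self-contained sketch fills that gap under stated assumptions: it uses an empty URLClassLoader as a stand-in for urlClassLoader, and it uses java.util.function.Supplier with a compact runWith method in place of my ExecutableAction and ClassLoaderSwitcher so that the example fits in a single file. The ContextClassLoaderDemo and runWith names are invented for this illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.function.Supplier;

public class ContextClassLoaderDemo
{
   /**
    * Runs the action with the given loader as the thread context class loader
    * and restores the original context class loader afterwards, even if the
    * action throws an exception.
    */
   public static <T> T runWith(final ClassLoader loader, final Supplier<T> action)
   {
      final Thread thread = Thread.currentThread();
      final ClassLoader original = thread.getContextClassLoader();
      try
      {
         thread.setContextClassLoader(loader);
         return action.get();
      }
      finally
      {
         thread.setContextClassLoader(original);
      }
   }

   public static void main(final String[] arguments)
   {
      final ClassLoader before = Thread.currentThread().getContextClassLoader();
      // Stand-in for urlClassLoader: a loader with no URLs that delegates to its parent.
      final ClassLoader urlClassLoader = new URLClassLoader(new URL[0], before);

      final ClassLoader seenInside =
         runWith(urlClassLoader, () -> Thread.currentThread().getContextClassLoader());

      // prints "Action saw requested loader: true"
      System.out.println("Action saw requested loader: " + (seenInside == urlClassLoader));
      // prints "Original loader restored: true"
      System.out.println("Original loader restored: "
         + (Thread.currentThread().getContextClassLoader() == before));
   }
}
```

Because the restore happens in a finally block, the original context class loader is also back in place when the supplied action throws an unchecked exception.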

Conclusion

Although it's undoubtedly best to avoid switching the context class loader as much as possible, there may be times when you have no other reasonable choice. In those times, encapsulating the multiple steps involved in the switch and switch back into a single method that can be called by clients adds safety to the operation and can even allow the client to have more concise code if written in JDK 8.

Saturday, August 6, 2016

Log4j 2.x XSD is Not Fully Descriptive

In the blog post JAXB and Log4j XML Configuration Files, I discussed "nuances and subtleties associated with using JAXB to work with [Log4j 1.x and Log4j 2.x] XML configuration files via Java classes." In this post, I look at another challenge associated with generation of Log4j 2.x configuration XML via JAXB objects generated from the Log4j 2.x XML Schema file Log4j-config.xsd: it doesn't fully specify the Log4j 2.x components' configuration characteristics.

When working with Log4j 2.x XML configuration, one of the first distinctions that is important to make is which "flavor" of XML is to be used ("concise" or "strict"). The concise format may be easier because names of XML elements correspond to the Log4j 2 components they represent, but only the strict format is supported by the XSD. The implication here is that any XML marshaled from JAXB objects generated from the Log4j 2.x XSD will necessarily be of "strict" format rather than of "concise" format.

Unfortunately, the XSD (Log4j-config.xsd) currently provided with the Log4j 2.x distribution is not sufficient for generating the full "strict" XML configuration supported by Log4j 2. I demonstrate this here with discussion of the XSD-defined complex type "AppenderType" because it's one of the most extreme cases of a supported element lacking specification of its potential attributes in the XSD. The code listing below shows AppenderType's definition in the Log4j-config.xsd as of Log4j 2.6.2.

AppenderType Defined in Log4j 2.6.2's Log4j-config.xsd

<xs:complexType name="AppenderType">
   <xs:sequence>
      <xs:element name="Layout" type="LayoutType" minOccurs="0"/>
      <xs:choice minOccurs="0" maxOccurs="1">
         <xs:element name="Filters" type="FiltersType"/>
         <xs:element name="Filter" type="FilterType"/>
      </xs:choice>
   </xs:sequence>
   <xs:attribute name="type" type="xs:string" use="required"/>
   <xs:attribute name="name" type="xs:string" use="required"/>
   <xs:attribute name="fileName" type="xs:string" use="optional"/>
</xs:complexType>

The excerpt of the XSD just shown tells us that an appender described in XSD-compliant XML will only be able to have one or more of the three attributes (type, name, and fileName). The "type" attribute is used to identify the type of appender it is (such as "File", "RollingFile", "Console", "Socket", and "Syslog"). The problem is that each "type" of appender has different properties and characteristics that ideally would be described by attributes on this AppenderType.

The Log4j 2.x documentation on Appenders lists characteristics of the different types of appenders. For example, this page indicates that the ConsoleAppender has seven parameters: filter, layout, follow, direct, name, ignoreExceptions, and target. The name is one of the attributes supported by the general AppenderType complex type and filter and layout are supported via nested elements in that AppenderType. The other four parameters that are available for a ConsoleAppender, however, have no mechanism prescribed in the XSD to define them.
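The effect of the missing attributes can be seen with the JDK's built-in schema validation. The sketch below is not the real Log4j-config.xsd: it validates against a trimmed stand-in schema that I wrote for this illustration, containing only the three AppenderType attributes shown earlier (the nested Layout and Filter elements are omitted) plus an assumed top-level Appender element. The class name AppenderSchemaCheck is likewise hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class AppenderSchemaCheck
{
   /** Trimmed stand-in for AppenderType: attributes only, not the real Log4j XSD. */
   private static final String TRIMMED_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='Appender' type='AppenderType'/>"
      + "<xs:complexType name='AppenderType'>"
      + "<xs:attribute name='type' type='xs:string' use='required'/>"
      + "<xs:attribute name='name' type='xs:string' use='required'/>"
      + "<xs:attribute name='fileName' type='xs:string' use='optional'/>"
      + "</xs:complexType>"
      + "</xs:schema>";

   /** Returns true if the XML validates against the trimmed schema, false if it is rejected. */
   public static boolean isValid(final String xml)
   {
      try
      {
         final SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
         final Schema schema = factory.newSchema(new StreamSource(new StringReader(TRIMMED_XSD)));
         // With no custom ErrorHandler set, validation errors surface as SAXException.
         schema.newValidator().validate(new StreamSource(new StringReader(xml)));
         return true;
      }
      catch (SAXException rejected)
      {
         return false;
      }
      catch (IOException ioException)
      {
         throw new UncheckedIOException(ioException);
      }
   }

   public static void main(final String[] arguments)
   {
      // Only attributes known to the schema: prints true
      System.out.println(isValid("<Appender type='Console' name='STDOUT'/>"));
      // 'target' is a ConsoleAppender parameter the schema does not declare: prints false
      System.out.println(isValid("<Appender type='Console' name='STDOUT' target='SYSTEM_OUT'/>"));
   }
}
```

The second document is rejected only because the target attribute is undeclared, which mirrors the gap described above for the ConsoleAppender.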

Without even considering custom Log4j 2.x appenders, the built-in Log4j 2.x appenders don't share the same attributes and characteristics, and most of them have more characteristics than the three attributes and two nested elements of the AppenderType specify. I discussed the seven parameters of a ConsoleAppender previously. Other examples include the RollingFileAppender with its twelve parameters (append, bufferedIO, bufferSize, filter, fileName, filePattern, immediateFlush, layout, name, policy, strategy, ignoreExceptions), the JDBCAppender with its seven parameters (name, ignoreExceptions, filter, bufferSize, connectionSource, tableName, columnConfigs), and the JMSAppender with its thirteen parameters (factoryBindingName, factoryName, filter, layout, name, password, providerURL, destinationBindingName, securityPrincipalName, securityCredentials, ignoreExceptions, urlPkgPrefixes, userName).

To describe each parameter available for a given appender type in the XSD would require the ability in XML Schema to write that a particular set of available attributes depends on the setting of the AppenderType's type attribute. Unfortunately, XML Schema doesn't readily support this type of conditional specification in which the available attributes of a given complex type are different based on one of the complex type's other attributes.

Because of the limitations of the schema language, a person wanting to use JAXB to generate objects with full support for all of the provided appenders would need to change the XSD. One approach would be to change the XSD so that an AppenderType had all possible attributes of any of the built-in appenders available as optional attributes for the element. The most obvious downside of this is that the XSD would then allow any appender type to have any attribute even when the attribute did not apply to a particular appender type. However, this approach would allow for the JAXB-generated objects to marshal out all XML attributes for a given appender type. The next code snippet illustrates how this might be started. Some of the additional attributes different appenders need are specified here, but even this longer list doesn't contain all of the attributes needed to support every built-in appender type.

Some Appender Attributes Added to AppenderType

<xs:complexType name="AppenderType">
   <xs:sequence>
      <xs:element name="Layout" type="LayoutType" minOccurs="0"/>
      <xs:choice minOccurs="0" maxOccurs="1">
         <xs:element name="Filters" type="FiltersType"/>
         <xs:element name="Filter" type="FilterType"/>
      </xs:choice>
   </xs:sequence>
   <xs:attribute name="type" type="xs:string" use="required"/>
   <xs:attribute name="name" type="xs:string" use="required"/>
   <xs:attribute name="fileName" type="xs:string" use="optional"/>
   <!-- Attributes specified below here are not in Log4j 2.x Log4j-config.xsd -->
   <xs:attribute name="target" type="xs:string" use="optional"/>
   <xs:attribute name="follow" type="xs:string" use="optional"/>
   <xs:attribute name="append" type="xs:string" use="optional"/>
   <xs:attribute name="filePattern" type="xs:string" use="optional"/>
   <xs:attribute name="host" type="xs:string" use="optional"/>
   <xs:attribute name="port" type="xs:string" use="optional"/>
   <xs:attribute name="protocol" type="xs:string" use="optional"/>
   <xs:attribute name="connectTimeoutMillis" type="xs:integer" use="optional"/>
   <xs:attribute name="reconnectionDelayMillis" type="xs:string" use="optional"/>
   <xs:attribute name="facility" type="xs:string" use="optional"/>
   <xs:attribute name="id" type="xs:string" use="optional"/>
   <xs:attribute name="enterpriseNumber" type="xs:integer" use="optional"/>
   <xs:attribute name="useMdc" type="xs:boolean" use="optional"/>
   <xs:attribute name="mdcId" type="xs:string" use="optional"/>
   <xs:attribute name="mdcPrefix" type="xs:string" use="optional"/>
   <xs:attribute name="eventPrefix" type="xs:string" use="optional"/>
   <xs:attribute name="newLine" type="xs:boolean" use="optional"/>
   <xs:attribute name="newLineEscape" type="xs:string" use="optional"/>
</xs:complexType>
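The trade-off of this first approach can be checked with the JDK's javax.xml.validation API. The following is a minimal sketch, not part of Log4j: the class name AppenderAttributeCheck and the tiny inline schema are my own assumptions and stand in for the much larger Log4j-config.xsd. It validates an appender element carrying a "target" attribute against a schema with only the three original attributes and against one extended with an optional "target" attribute.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class AppenderAttributeCheck
{
   /** One extra optional attribute, in the style of the listing above. */
   static final String TARGET_ATTRIBUTE =
      "<xs:attribute name=\"target\" type=\"xs:string\" use=\"optional\"/>";

   /** Sample appender XML that uses the "target" attribute. */
   static final String CONSOLE_XML =
      "<Appender type=\"Console\" name=\"STDOUT\" target=\"SYSTEM_OUT\"/>";

   /** Builds a tiny, hypothetical schema; this is NOT the real Log4j-config.xsd. */
   static String miniSchema(final String extraAttributes)
   {
      return "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
         + "<xs:element name=\"Appender\" type=\"AppenderType\"/>"
         + "<xs:complexType name=\"AppenderType\">"
         + "<xs:attribute name=\"type\" type=\"xs:string\" use=\"required\"/>"
         + "<xs:attribute name=\"name\" type=\"xs:string\" use=\"required\"/>"
         + "<xs:attribute name=\"fileName\" type=\"xs:string\" use=\"optional\"/>"
         + extraAttributes
         + "</xs:complexType>"
         + "</xs:schema>";
   }

   /** Returns true if the XML validates against the XSD; false on any schema or validation error. */
   static boolean isValid(final String xsd, final String xml)
   {
      try
      {
         final SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
         final Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
         schema.newValidator().validate(new StreamSource(new StringReader(xml)));
         return true;
      }
      catch (SAXException invalid)
      {
         return false;
      }
      catch (IOException ioException)
      {
         throw new UncheckedIOException(ioException);
      }
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Original attributes only: " + isValid(miniSchema(""), CONSOLE_XML));
      System.out.println("With optional target:     " + isValid(miniSchema(TARGET_ATTRIBUTE), CONSOLE_XML));
      // The downside: "target" is also accepted on an appender type it does not apply to.
      final String fileXml = "<Appender type=\"File\" name=\"F\" target=\"SYSTEM_OUT\"/>";
      System.out.println("File appender w/ target:  " + isValid(miniSchema(TARGET_ATTRIBUTE), fileXml));
   }
}
```

The third check illustrates the downside described above: once "target" is optional on the shared AppenderType, the schema can no longer reject it on a "File" appender.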

A second approach to changing the Log4j 2.x XSD to support all built-in appenders fully would be to change the XSD design from having a single AppenderType whose specific type was specified by the type attribute to having many different complex types each representing the different built-in appender types. With this approach, all the attributes for any given appender and only the attributes associated with that given appender could be enforced by the XSD. This approach of having an element type per appender is similar to how the "concise" XML format works, but there is no XSD support for that currently.

Note that I have intentionally focused on the built-in appender types here because that's what a static XSD could be expected to reasonably, adequately, and completely support. As an aside, custom appender types could be supported by specifying arbitrary name/value pairs for attributes as is done for filters or with parameters, but these also lead to the ability to specify extra and even nonsense attributes without any ability for the schema to catch those. A third approach that would support custom types would be to not use a static XSD for describing the grammar, but instead use a generated XSD. One could hand-write such an XSD based upon the descriptions of Log4j 2.x components in the documentation, but a better approach might be to take advantage of the @PluginFactory, @PluginElement, and @PluginAttribute annotations used in the Log4j 2.x source code. The two code listings that follow are from the Apache Log4j 2.6.2 code base and demonstrate how these annotations describe what would be the elements and attributes of given types.

ConsoleAppender.createAppender() Signature

@PluginFactory
public static ConsoleAppender createAppender(
   @PluginElement("Layout") Layout layout,
   @PluginElement("Filter") final Filter filter,
   @PluginAttribute(value = "target", defaultString = "SYSTEM_OUT") final String targetStr,
   @PluginAttribute("name") final String name,
   @PluginAttribute(value = "follow", defaultBoolean = false) final String follow,
   @PluginAttribute(value = "ignoreExceptions", defaultBoolean = true) final String ignore)

SyslogAppender.createAppender() Signature

@PluginFactory
public static SyslogAppender createAppender(
   // @formatter:off
   @PluginAttribute("host") final String host,
   @PluginAttribute(value = "port", defaultInt = 0) final int port,
   @PluginAttribute("protocol") final String protocolStr,
   @PluginElement("SSL") final SslConfiguration sslConfig,
   @PluginAttribute(value = "connectTimeoutMillis", defaultInt = 0) final int connectTimeoutMillis,
   @PluginAliases("reconnectionDelay") // deprecated
   @PluginAttribute(value = "reconnectionDelayMillis", defaultInt = 0) final int reconnectionDelayMillis,
   @PluginAttribute(value = "immediateFail", defaultBoolean = true) final boolean immediateFail,
   @PluginAttribute("name") final String name,
   @PluginAttribute(value = "immediateFlush", defaultBoolean = true) final boolean immediateFlush,
   @PluginAttribute(value = "ignoreExceptions", defaultBoolean = true) final boolean ignoreExceptions,
   @PluginAttribute(value = "facility", defaultString = "LOCAL0") final Facility facility,
   @PluginAttribute("id") final String id,
   @PluginAttribute(value = "enterpriseNumber", defaultInt = Rfc5424Layout.DEFAULT_ENTERPRISE_NUMBER) final int enterpriseNumber,
   @PluginAttribute(value = "includeMdc", defaultBoolean = true) final boolean includeMdc,
   @PluginAttribute("mdcId") final String mdcId,
   @PluginAttribute("mdcPrefix") final String mdcPrefix,
   @PluginAttribute("eventPrefix") final String eventPrefix,
   @PluginAttribute(value = "newLine", defaultBoolean = false) final boolean newLine,
   @PluginAttribute("newLineEscape") final String escapeNL,
   @PluginAttribute("appName") final String appName,
   @PluginAttribute("messageId") final String msgId,
   @PluginAttribute("mdcExcludes") final String excludes,
   @PluginAttribute("mdcIncludes") final String includes,
   @PluginAttribute("mdcRequired") final String required,
   @PluginAttribute("format") final String format,
   @PluginElement("Filter") final Filter filter,
   @PluginConfiguration final Configuration config,
   @PluginAttribute(value = "charset", defaultString = "UTF-8") final Charset charsetName,
   @PluginAttribute("exceptionPattern") final String exceptionPattern,
   @PluginElement("LoggerFields") final LoggerFields[] loggerFields,
   @PluginAttribute(value = "advertise", defaultBoolean = false) final boolean advertise)

This approach requires several steps because one would need to dynamically generate the XSD using knowledge of the main components of the Log4j 2.x architecture in conjunction with annotation processing and then use JAXB to generate the Java classes capable of marshaling the comprehensive Log4j 2.x XML.
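The first of those steps, turning annotated factory parameters into attribute declarations, might look roughly like the following sketch. To keep it runnable without Log4j on the classpath, it uses a hypothetical @Attr annotation and a hypothetical createDemoAppender() method as stand-ins for @PluginAttribute and a real createAppender() method. The mapping from Java types to XSD types is also my own simplification, and I use runtime reflection here rather than a compile-time annotation processor.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class XsdAttributeSketch
{
   /** Hypothetical stand-in for Log4j's @PluginAttribute; not the real annotation. */
   @Retention(RetentionPolicy.RUNTIME)
   @Target(ElementType.PARAMETER)
   @interface Attr
   {
      String value();
   }

   /** Hypothetical factory method mimicking the shape of a createAppender() signature. */
   static Object createDemoAppender(
      @Attr("name") final String name,
      @Attr("port") final int port,
      @Attr("newLine") final boolean newLine)
   {
      return new Object();
   }

   /** Simplified (assumed) mapping from a Java parameter type to an XSD simple type. */
   static String xsdTypeFor(final Class<?> javaType)
   {
      if (javaType == int.class || javaType == Integer.class)
      {
         return "xs:int";
      }
      if (javaType == boolean.class || javaType == Boolean.class)
      {
         return "xs:boolean";
      }
      return "xs:string";
   }

   /** Emits one optional xs:attribute declaration per @Attr-annotated parameter. */
   static List<String> attributeDeclarations(final Method factory)
   {
      final List<String> declarations = new ArrayList<>();
      final Class<?>[] types = factory.getParameterTypes();
      final Annotation[][] annotations = factory.getParameterAnnotations();
      for (int i = 0; i < types.length; i++)
      {
         for (final Annotation annotation : annotations[i])
         {
            if (annotation instanceof Attr)
            {
               declarations.add("<xs:attribute name=\"" + ((Attr) annotation).value()
                  + "\" type=\"" + xsdTypeFor(types[i]) + "\" use=\"optional\"/>");
            }
         }
      }
      return declarations;
   }

   /** Runs the generation against the hypothetical factory method above. */
   static List<String> demoDeclarations()
   {
      try
      {
         final Method factory = XsdAttributeSketch.class.getDeclaredMethod(
            "createDemoAppender", String.class, int.class, boolean.class);
         return attributeDeclarations(factory);
      }
      catch (NoSuchMethodException exception)
      {
         throw new IllegalStateException(exception);
      }
   }

   public static void main(final String[] arguments)
   {
      demoDeclarations().forEach(System.out::println);
   }
}
```

A real generator would also need to handle @PluginElement parameters as nested elements, default values, and aliases, none of which this sketch attempts.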

Another option to be considered is to use "concise" XML or another form of Log4j 2.x configuration (such as JSON or properties files) and not use the XSD to generate JAXB objects for marshaling Log4j 2.x configuration. It's worth noting that XML configuration files used for Log4j 2.x with the "strict" format obviously do not need to validate against the Log4j-config.xsd or else the "strict" form of XML would not be able to fully specify the Log4j2 configuration. The implication of this is that the remaining value of even having the XSD is either for our own tools or scripts to use it to validate our XML configuration prior to using it with Log4j 2.x or for use in marshaling/unmarshaling Log4j 2.x XML with JAXB.

Conclusion

The Log4j-config.xsd provided with the Log4j2 distribution is not sufficient for validating all Log4j 2.x constructs in "strict" XML configuration and is likewise insufficient for generating JAXB objects to use to marshal Log4j2 strict XML. Developers wishing to use XSD for validation or JAXB class generation would need to manually change the XSD or generate one from the Log4j2 source code.

Wednesday, July 20, 2016

JAXB and Log4j XML Configuration Files

Both Log4j 1.x and Log4j 2.x support use of XML files to specify logging configuration. This post looks into some of the nuances and subtleties associated with using JAXB to work with these XML configuration files via Java classes. The examples in this post are based on Apache Log4j 1.2.17, Apache Log4j 2.6.2, and Java 1.8.0_73 with JAXB xjc 2.2.8-b130911.1802.

Log4j 1.x : log4j.dtd

Log4j 1.x's XML grammar is defined by a DTD instead of a W3C XML Schema. Fortunately, the JAXB implementation that comes with the JDK provides an "experimental, unsupported" option for using DTDs as the input from which Java classes are generated. The following command can be used to run the xjc command-line tool against the log4j.dtd.

    xjc -p dustin.examples.l4j1 -d src -dtd log4j.dtd

The next screen snapshot demonstrates this.

Running the command described above and demonstrated in the screen snapshot leads to Java classes being generated in a Java package in the src directory called dustin.examples.l4j1 that allow for unmarshalling from log4j.dtd-compliant XML and for marshalling to log4j.dtd-compliant XML.

Log4j 2.x : Log4j-config.xsd

Log4j 2.x's XML configuration can be either "concise" or "strict" and I need to use "strict" in this post because that is the form that uses a grammar defined by the W3C XML Schema file Log4j-config.xsd and I need a schema to generate Java classes with JAXB. The following command can be run against this XML Schema to generate Java classes representing Log4j2 strict XML.

    xjc -p dustin.examples.l4j2 -d src Log4j-config.xsd -b l4j2.jxb

Running the above command leads to Java classes being generated in a Java package in the src directory called dustin.examples.l4j2 that allow for unmarshalling from Log4j-config.xsd-compliant XML and for marshalling to Log4j-config.xsd-compliant XML.

In the previous example, I included a JAXB binding file with the option -b followed by the name of the binding file (-b l4j2.jxb). This binding was needed to avoid an error that prevented xjc from generating Log4j 2.x-compliant Java classes with the error message, "Property "Value" is already defined. Use <jaxb:property> to resolve this conflict." This issue and how to resolve it are discussed in A Brit in Bermuda's post Property "Value" is already defined. Use <jaxb:property> to resolve this conflict. The source for the JAXB binding file I used here is shown next.

l4j2.jxb

<jxb:bindings version="2.0"
              xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
              xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <jxb:bindings schemaLocation="Log4j-config.xsd" node="/xsd:schema">
      <jxb:bindings node="//xsd:complexType[@name='KeyValuePairType']">
         <jxb:bindings node=".//xsd:attribute[@name='value']">
            <jxb:property name="pairValue"/>
         </jxb:bindings>
      </jxb:bindings>
   </jxb:bindings>
</jxb:bindings>

The JAXB binding file just shown allows xjc to successfully parse the XSD and generate the Java classes. The one small price to pay (besides writing and referencing the binding file) is that the "value" attribute of the KeyValuePairType will need to be accessed in the Java class as a field named pairValue instead of value.

Unmarshalling Log4j 1.x XML

A potential use case for working with JAXB-generated classes for Log4j 1.x's log4j.dtd and Log4j 2.x's Log4j-config.xsd is conversion of Log4j 1.x XML configuration files to Log4j 2.x "strict" XML configuration files. In this situation, one would need to unmarshall Log4j 1.x log4j.dtd-compliant XML and marshall Log4j 2.x Log4j-config.xsd-compliant XML.

The following code listing demonstrates how the Log4j 1.x XML might be unmarshalled using the previously generated JAXB classes.

   /**
    * Extract the contents of the Log4j 1.x XML configuration file
    * with the provided path/name.
    *
    * @param log4j1XmlFileName Path/name of Log4j 1.x XML config file.
    * @return Contents of Log4j 1.x configuration file.
    * @throws RuntimeException Thrown if exception occurs that prevents
    *    extracting contents from XML with provided name.
    */
   public Log4JConfiguration readLog4j1Config(final String log4j1XmlFileName)
      throws RuntimeException
   {
      Log4JConfiguration config;
      try
      {
         final File inputFile = new File(log4j1XmlFileName);
         if (!inputFile.isFile())
         {
            throw new RuntimeException(log4j1XmlFileName + " is NOT a parseable file.");
         }

         final SAXParserFactory spf = SAXParserFactory.newInstance();
         final SAXParser sp = spf.newSAXParser();
         final XMLReader xr = sp.getXMLReader();
         
         final JAXBContext jaxbContext = JAXBContext.newInstance("dustin.examples.l4j1");
         final Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
         final UnmarshallerHandler unmarshallerHandler = unmarshaller.getUnmarshallerHandler();
         xr.setContentHandler(unmarshallerHandler);

         // try-with-resources closes the stream even if parsing fails
         try (final FileInputStream xmlStream = new FileInputStream(log4j1XmlFileName))
         {
            final InputSource xmlSource = new InputSource(xmlStream);
            xr.parse(xmlSource);
         }

         final Object unmarshalledObject = unmarshallerHandler.getResult();
         config = (Log4JConfiguration) unmarshalledObject;
      }
      catch (JAXBException | ParserConfigurationException | SAXException | IOException exception)
      {
         throw new RuntimeException(
            "Unable to read from file " + log4j1XmlFileName + " - " + exception,
            exception);
      }
      return config;
   }

Unmarshalling this Log4j 1.x XML was a bit trickier than some XML unmarshalling because of the nature of log4j.dtd's namespace treatment. The approach for dealing with this wrinkle is described in Gik's Jaxb UnMarshall without namespace and in Deepa S's How to instruct JAXB to ignore Namespaces. Using this approach helped avoid the error message:

UnmarshalException: unexpected element (uri:"http://jakarta.apache.org/log4j/", local:"configuration"). Expected elements ...
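The namespace wrinkle can be seen in isolation with plain SAX and no JAXB involved. The following sketch (the class name NamespaceAwarenessSketch and the one-element XML document are my own illustration) parses a log4j:configuration root element twice: once with the SAXParserFactory default, which is not namespace-aware, and once with namespace awareness switched on. It records the names the parser reports to the content handler.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class NamespaceAwarenessSketch
{
   /** Minimal stand-in for a Log4j 1.x configuration file (no DOCTYPE, no children). */
   static final String XML =
      "<log4j:configuration xmlns:log4j=\"http://jakarta.apache.org/log4j/\"/>";

   /** Returns {uri, localName, qName} as reported by SAX for the root element. */
   static String[] rootElementNames(final boolean namespaceAware)
   {
      final String[] names = new String[3];
      try
      {
         final SAXParserFactory spf = SAXParserFactory.newInstance();
         spf.setNamespaceAware(namespaceAware);
         spf.newSAXParser().parse(new InputSource(new StringReader(XML)), new DefaultHandler()
         {
            @Override
            public void startElement(
               final String uri, final String localName,
               final String qName, final Attributes attributes)
            {
               if (names[2] == null)
               {
                  names[0] = uri;
                  names[1] = localName;
                  names[2] = qName;
               }
            }
         });
      }
      catch (ParserConfigurationException | SAXException | IOException exception)
      {
         throw new RuntimeException(exception);
      }
      return names;
   }

   public static void main(final String[] arguments)
   {
      for (final boolean aware : new boolean[] {false, true})
      {
         final String[] names = rootElementNames(aware);
         System.out.println("namespaceAware=" + aware + " uri=[" + names[0]
            + "] localName=[" + names[1] + "] qName=[" + names[2] + "]");
      }
   }
}
```

With namespace awareness on, the root element arrives with the http://jakarta.apache.org/log4j/ URI seen in the error message above. With the default factory settings used in the earlier listing, no namespace URI is reported and only the qualified name log4j:configuration is available.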

To unmarshall the Log4j 1.x XML, which in my case references log4j.dtd on the filesystem, I needed to provide a special Java system property to the Java launcher when running this code with Java 8. Specifically, I needed to specify
     -Djavax.xml.accessExternalDTD=all
to avoid the error message, "Failed to read external DTD because 'file' access is not allowed due to restriction set by the accessExternalDTD property." Additional details on this can be found at NetBeans's FaqWSDLExternalSchema Wiki page.

Marshalling Log4j 2.x XML

Marshalling Log4j 2.x XML using the JAXB-generated Java classes is fairly straightforward as demonstrated in the following example code:

   /**
    * Write Log4j 2.x "strict" XML configuration to file with
    * provided name based on provided content.
    *
    * @param log4j2Configuration Content to be written to Log4j 2.x
    *    XML configuration file.
    * @param log4j2XmlFile File to which Log4j 2.x "strict" XML
    *    configuration should be written.
    */
   public void writeStrictLog4j2Config(
      final ConfigurationType log4j2Configuration,
      final String log4j2XmlFile)
   {
      try (final OutputStream os = new FileOutputStream(log4j2XmlFile))
      {
         final JAXBContext jc = JAXBContext.newInstance("dustin.examples.l4j2");
         final Marshaller marshaller = jc.createMarshaller();
         marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
         marshaller.marshal(new ObjectFactory().createConfiguration(log4j2Configuration), os);
      }
      catch (JAXBException | IOException exception)
      {
         throw new RuntimeException(
            "Unable to write Log4j 2.x XML configuration - " + exception,
            exception);
      }
   }

There is one subtlety in this marshalling case that may not be obvious in the just-shown code listing. The classes that JAXB's xjc generated from the Log4j-config.xsd lack any class with @XmlRootElement. The JAXB classes that were generated from the Log4j 1.x log4j.dtd did include classes with this @XmlRootElement annotation. Because the Log4j 2.x Log4j-config.xsd-based Java classes don't have this annotation, the following error occurs when trying to marshal the ConfigurationType instance directly:

MarshalException - with linked exception: [com.sun.istack.internal.SAXException2: unable to marshal type "dustin.examples.l4j2.ConfigurationType" as an element because it is missing an @XmlRootElement annotation]

To avoid this error, I instead (in the marshaller.marshal call of the above code listing) marshalled the result of invoking new ObjectFactory().createConfiguration(ConfigurationType) on the passed-in ConfigurationType instance and it is now successfully marshalled.

Conclusion

JAXB can be used to generate Java classes from Log4j 1.x's log4j.dtd and from Log4j 2.x's Log4j-config.xsd, but there are some subtleties and nuances associated with this process to successfully generate these Java classes and to use the generated Java classes to marshal and unmarshal XML.

Friday, July 15, 2016

Apache PDFBox Command-line Tools: No Java Coding Required

In the blog post Apache PDFBox 2, I demonstrated use of Apache PDFBox 2 as a library called from within Java code to manipulate PDFs. It turns out that Apache PDFBox 2 also provides command-line tools that can be used directly from the command-line as-is with no additional Java coding required. There are several command-line tools available and I will demonstrate some of them in this post.

The PDFBox command-line tools are executed by taking advantage of PDFBox's executable JAR (java -jar with Main-Class: org.apache.pdfbox.tools.PDFBox). This is the JAR with "app" in its name and, for this particular blog post, is pdfbox-app-2.0.2.jar. The general format used to invoke these tools is java -jar pdfbox-app-2.0.2.jar <Command> [options] [files].

When the executable JAR is executed without arguments, a form of help is provided that lists the available commands. This is shown in the next screen snapshot.

This screen snapshot shows that this version of Apache PDFBox (2.0.2) advertises support for the "Possible commands" of ConvertColorspace, Decrypt, Encrypt, ExtractText, ExtractImages, OverlayPDF, PrintPDF, PDFDebugger, PDFMerger, PDFReader, PDFSplit, PDFToImage, TextToPDF, and WriteDecodedDoc.

Extracting Text: "ExtractText"

The first command-line tool I am looking at is extracting text from a PDF. I demonstrated using PDFBox to do this from Java code in my previous blog post. Here, I will use PDFBox to do the same thing directly from the command-line with no Java source code in sight. The following operation extracts the text from the PDF Scala by Example. In my previous post, the Java code accessed this PDF online and used PDFBox to extract text from it. In this case, I've downloaded Scala by Example and am running the PDFBox ExtractText command-line tool against that downloaded PDF stored on my hard drive at C:\pdf\ScalaByExample.pdf.

The command to extract text from the PDF from the command-line using PDFBox is: java -jar pdfbox-app-2.0.2.jar ExtractText C:\pdf\ScalaByExample.pdf. The next two screen snapshots demonstrate running this command and the file it generates. From these screen snapshots, we can see that the text file generated by this command by default has the same name as the source PDF but with a .txt extension. This command supports multiple options including the ability to specify the name of the text file by placing that name after the source PDF's file name and the ability to write the text to the console instead of to a file via the -console flag (from which the output could be redirected). Examples of how to specify a custom text file name and how to direct text to console instead of file are shown next.

  • Explicitly Specifying Text File Name:
    • java -jar pdfbox-app-2.0.2.jar ExtractText C:\pdf\ScalaByExample.pdf C:\pdf\dustin.txt
  • Rendering Text on Console
    • java -jar pdfbox-app-2.0.2.jar ExtractText -console C:\pdf\ScalaByExample.pdf
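Although no Java coding is required to use these tools, the same command lines can be driven from an automated tool when that is convenient. The following is a minimal sketch using the JDK's ProcessBuilder. The class name PdfBoxCommandSketch and its helper methods are hypothetical, and it assumes that java is on the PATH and that pdfbox-app-2.0.2.jar is in the working directory.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PdfBoxCommandSketch
{
   /** Assembles: java -jar <appJar> <tool> [toolArguments...] */
   static List<String> buildCommand(
      final String appJar, final String tool, final String... toolArguments)
   {
      final List<String> command =
         new ArrayList<>(Arrays.asList("java", "-jar", appJar, tool));
      command.addAll(Arrays.asList(toolArguments));
      return command;
   }

   /** Runs the command, sharing this process's console, and returns its exit code. */
   static int run(final List<String> command) throws IOException, InterruptedException
   {
      final Process process = new ProcessBuilder(command).inheritIO().start();
      return process.waitFor();
   }

   public static void main(final String[] arguments) throws Exception
   {
      final String appJar = "pdfbox-app-2.0.2.jar";
      final List<String> command = buildCommand(
         appJar, "ExtractText", "-console", "C:\\pdf\\ScalaByExample.pdf");
      System.out.println(String.join(" ", command));

      // Only attempt to launch the tool if the executable JAR is actually present.
      if (new File(appJar).isFile())
      {
         System.out.println("Exit code: " + run(command));
      }
   }
}
```

Passing the arguments as a list rather than as one concatenated string avoids quoting problems when file paths contain spaces.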

PDF from Text: "TextToPDF"

When it is desirable to go the other way (start with text as the source and generate a PDF), the command TextToPDF is appropriate. To demonstrate this, I'm using a source text file called doi.txt that contains a portion of the United States Declaration of Independence:

The unanimous Declaration of the thirteen united States of America,

When in the Course of human events, it becomes necessary for one people to dissolve the political bands which have connected them with another, and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation.

We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness

With a sample text file in place at C:\pdf\doi.txt, PDFBox's TextToPDF can be run against it with the command java -jar pdfbox-app-2.0.2.jar TextToPDF C:\pdf\doi.pdf C:\pdf\doi.txt (note that the target PDF is listed as the first argument and the source text file is listed as the second argument). The next three screen snapshots demonstrate running this command and the successful generation of a PDF from the source text file.

Extracting Images from PDFs: "ExtractImages"

The PDFBox command-line tool ExtractImages makes it as easy to extract images from a PDF as the command-line tool "ExtractText" made it to extract text from a PDF. My demonstration of this capability will extract four images from a PDF I created with images from the Black Hills (and surrounding area) of South Dakota that is called BlackHillsSouthDakotaAndSurroundingSights.pdf. A screen snapshot of this PDF is shown next.

PDFBox can be used to extract the four photographs in this PDF with the command java -jar pdfbox-app-2.0.2.jar ExtractImages C:\pdf\BlackHillsSouthDakotaAndSurroundingSights.pdf as demonstrated in the next screen snapshot.

Running this command as shown in the last screen snapshot extracts the four images from the PDF. Each extracted image is named after the source PDF with a hyphen and counting integer appended to the end of the name. The generated images are also JPEG files with .jpg extensions. In this case, the names of the generated files are thus BlackHillsSouthDakotaAndSurroundingSights-1.jpg, BlackHillsSouthDakotaAndSurroundingSights-2.jpg, BlackHillsSouthDakotaAndSurroundingSights-3.jpg, and BlackHillsSouthDakotaAndSurroundingSights-4.jpg and each is displayed next in the form extracted directly from the PDF.

BlackHillsSouthDakotaAndSurroundingSights-1.jpg BlackHillsSouthDakotaAndSurroundingSights-2.jpg
BlackHillsSouthDakotaAndSurroundingSights-3.jpg BlackHillsSouthDakotaAndSurroundingSights-4.jpg

Encrypting PDF: "Encrypt"

Apache PDFBox makes it easy to encrypt a PDF. For example, I can encrypt the PDF used in the "ExtractImages" example with the following command: java -jar pdfbox-app-2.0.2.jar Encrypt -O DustinWasHere -U DustinWasHere C:\pdf\BlackHillsSouthDakotaAndSurroundingSights.pdf as shown in the next screen snapshot:

Once I've run the encrypt command, I need a password to open this PDF in Adobe Reader:

Decrypting PDF: "Decrypt"

It's just as easy to decrypt this PDF with the command java -jar pdfbox-app-2.0.2.jar Decrypt -password DustinWasHere C:\pdf\BlackHillsSouthDakotaAndSurroundingSights.pdf as shown in the next screen snapshot. The image demonstrates that an InvalidPasswordException is thrown when no password is provided (or the wrong password is provided) for decrypting the PDF. It then shows a successful decryption, after which I'm once again able to open the PDF in Adobe Reader without a password.

Merging PDFs: "PDFMerger"

PDFBox allows multiple PDFs to be merged into a single PDF with the "PDFMerger" command. This is demonstrated in the next screen snapshots by merging the two single-page PDFs mentioned earlier (doi.pdf and BlackHillsSouthDakotaAndSurroundingSights.pdf) into a new PDF called third.pdf with the command java -jar pdfbox-app-2.0.2.jar PDFMerger C:\pdf\doi.pdf C:\pdf\BlackHillsSouthDakotaAndSurroundingSights.pdf C:\pdf\third.pdf.

Splitting PDFs: "PDFSplit"

I can split the third.pdf PDF just created with PDFMerger with the command PDFSplit. This is a particularly simple case because the PDF being split is only two pages. The command is demonstrated with the next screen snapshots.

The snapshots demonstrate that the PDFs split out of third.pdf are called third-1.pdf and third-2.pdf.
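The output naming observed in this post for both ExtractImages and PDFSplit (source name, hyphen, counting integer, extension) is simple enough to predict in a script that needs to pick up the generated files. The following small sketch encodes only what the examples in this post showed. The class and method names are hypothetical, and other PDFBox versions or options may name files differently.

```java
import java.util.ArrayList;
import java.util.List;

public class PdfBoxOutputNames
{
   /**
    * Predicts output file names of the form <base>-<n>.<extension>,
    * where <base> is the source PDF name without its .pdf suffix
    * and n counts up from 1.
    */
   static List<String> expectedNames(
      final String sourcePdf, final int count, final String extension)
   {
      final String base = sourcePdf.endsWith(".pdf")
         ? sourcePdf.substring(0, sourcePdf.length() - ".pdf".length())
         : sourcePdf;
      final List<String> names = new ArrayList<>();
      for (int number = 1; number <= count; number++)
      {
         names.add(base + "-" + number + "." + extension);
      }
      return names;
   }

   public static void main(final String[] arguments)
   {
      // Names matching the PDFSplit example above
      System.out.println(expectedNames("third.pdf", 2, "pdf"));
      // Names matching the ExtractImages example earlier in the post
      System.out.println(
         expectedNames("BlackHillsSouthDakotaAndSurroundingSights.pdf", 4, "jpg"));
   }
}
```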

Conclusion

In this post, I showed several of the command-line utilities that Apache PDFBox makes available out-of-the-box with no Java coding required. There are a few other command-line utilities available that were not demonstrated here. All of these commands are easily used by running the executable "app" JAR provided with a PDFBox distribution. As command-line utilities, these tools enjoy the advantages of command-line tools including being quick to run and able to be included within scripts and other automated tools. Another benefit of these tools is that, because they are implemented in open source, developers can use the source code for these tools to see how to use the PDFBox APIs in their own applications and tools. Apache PDFBox's command-line tools are freely available and easy-to-use PDF manipulation tools that can be used with no extra Java code being written.