Tuesday, October 6, 2015

Single Quotes in Oracle Database Index Column Specification

In my previous post, I mentioned that a downside of using double quotes to explicitly specify case in Oracle identifiers is the potential for confusion with single quotes, which delimit string literals. Although I don't personally think this is sufficient reason to avoid double quotes for identifiers in Oracle, it is worth being aware of this potential confusion. When to use single quotes and when to use double quotes has long been a source of confusion for users new to databases that distinguish between the two. In this post, I look at an example of how accidentally using single quotes where no quotes are appropriate can lead to the creation of an unnecessary index.

The SQL in the simple script createPersonTable.sql generates a table called PEOPLE, and an index is implicitly created for this table's primary key ID column. However, the script also contains an explicit index creation statement that, at first sight, might appear to create another index on this primary key column.

CREATE TABLE people
(
   id number PRIMARY KEY,
   last_name varchar2(100),
   first_name varchar2(100)
);

CREATE INDEX people_pk_index ON people('id');

We might expect the statement that appears to explicitly create the primary key column index to fail because that column is already indexed. As the output below shows, it does not fail.

When a query is run against the indexes, it becomes apparent why the explicit index creation did not fail: it was not creating another index on the same column. The single quotes around what appears to be the id column name actually make 'id' a string literal rather than a column name, and the index that is created is a function-based index rather than a column index. This is shown in the query contained in the next screen snapshot.

The index with name PEOPLE_PK_INDEX was the one explicitly created in the script and is a function-based index. The implicitly created primary key column index has a system-generated name. In this example, the function-based index is a useless index that provides no value.

It's interesting to see what happens when I attempt to explicitly create the index on the column by using double quotes with "id" and "ID". The first, "id", fails with "invalid identifier" because Oracle implicitly folded the unquoted name id to uppercase ID during table creation. The second, "ID", fails with "such column list already indexed" because this attempt finally tries to create an index on the same column for which an index was already implicitly created.
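The three attempts described above can be sketched together as follows (the second and third index names are hypothetical, and the ORA- numbers shown are the usual codes for the quoted error messages):

```sql
-- Succeeds, but creates a function-based index on the string literal 'id':
CREATE INDEX people_pk_index ON people('id');

-- Fails: ORA-00904 invalid identifier ("id" does not match the
-- uppercase-folded column name ID):
CREATE INDEX people_pk_index2 ON people("id");

-- Fails: ORA-01408 such column list already indexed (the primary key
-- column already has an implicitly created index):
CREATE INDEX people_pk_index3 ON people("ID");
```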

In my original example, passing a string literal as the "column" in the index creation statement resulted in the creation of a useless function-based index. It could have been worse: if my intended primary key column index hadn't already been implicitly created, I might not have had the index I thought I had. This, of course, could happen when creating an index for a column or list of columns that won't have indexes created for them implicitly. There is no error message to warn us that the single-quoted string is being treated as a string literal rather than as a column name.


The general rule of thumb to remember when working with quotation marks in Oracle database is that double quotes are for identifiers (such as column names and table names) and single quotes are for string literals. As this post has demonstrated, there are times when one may be misused in place of the other and lead to unexpected results without necessarily displaying an error message.

Monday, October 5, 2015

Downsides of Mixed Identifiers When Porting Between Oracle and PostgreSQL Databases

Both the Oracle database and the PostgreSQL database use the presence or absence of double quotes to indicate case-sensitive or case-insensitive identifiers. Each of these databases allows identifiers to be named without quotes (generally case-insensitive) or with double quotes (case-sensitive). This blog post discusses some of the potential negative consequences of mixing quoted (or delimited) identifiers and case-insensitive identifiers in an Oracle or PostgreSQL database and then trying to port SQL to the other database.

Advantages of Case-Sensitive Quoted/Delimited Identifiers

There are multiple advantages of case sensitive identifiers. Some of the advertised (real and perceived) benefits of case sensitive database identifiers include:

  • Ability to use reserved words, key words, and special symbols not available to identifiers without quotes.
    • PostgreSQL's keywords:
      • reserved ("only real key words" that "are never allowed as identifiers")
      • unreserved ("special meaning in particular contexts," but "can be used as identifiers in other contexts").
      • "Quoted identifiers can contain any character, except the character with code zero. (To include a double quote, write two double quotes.) This allows constructing table or column names that would otherwise not be possible, such as ones containing spaces or ampersands."
    • Oracle reserved words and keywords:
      • Oracle SQL Reserved Words that can only be used as "quoted identifiers, although this is not recommended."
      • Oracle SQL Keywords "are not reserved," but using these keywords as names can lead to "SQL statements [that] may be more difficult to read and may lead to unpredictable results."
      • "Nonquoted identifiers must begin with an alphabetic character from your database character set. Quoted identifiers can begin with any character."
      • "Quoted identifiers can contain any characters and punctuation marks as well as spaces."
  • Ability to use the same characters for two different identifiers with case being the differentiation feature.
  • Avoid dependency on a database implementation's case assumptions and provide "one universal version."
  • Explicit case specification avoids issues with case assumptions that might be changeable in some databases such as SQL Server.
  • Consistency with most programming languages and operating systems' file systems.
  • Specified in the SQL specification and explicitly spells out the case of identifiers rather than relying on implementation details (case folding) of a particular database.
  • Additional protection in cases where external users are allowed to specify SQL that is to be interpreted as identifiers.

Advantages of Case-Insensitive Identifiers

There are also advantages associated with the use of case-insensitive identifiers. It can be argued that case-insensitive identifiers are the "default" in the Oracle and PostgreSQL databases because one must use quotes to override this default case-insensitivity.

  • Case-insensitivity is the "default" in Oracle and PostgreSQL databases.
  • The best case for readability can be used in any particular context. For example, this allows DML and DDL statements to be written to a particular coding convention and then be automatically mapped to the appropriate case folding for various databases.
  • Avoids errors introduced by developers who are unaware of or unwilling to follow case conventions.
  • Double quotes (" ") are very different from single quotes (' ') in at least some contexts in both the Oracle and PostgreSQL databases, and not using case-sensitive identifier double quotes eliminates the need to remember the difference or worry about the next developer not remembering it.
  • Many of the above listed "advantages" may not really be good practices:
    • Using reserved words and keywords as identifiers is probably not good for readability anyway.
    • Using symbols allowed in quoted identifiers that are not allowed in unquoted identifiers may not be necessary or even desirable.
    • Having two different variables of the same name with just different characters cases is probably not a good idea.

Default Case-Insensitive or Quoted Case-Sensitive Identifiers?

In Don’t use double quotes in PostgreSQL, Reuven Lerner makes a case for using PostgreSQL's "default" (no double quotes) case-insensitive identifiers. Lerner also points out that pgAdmin implicitly creates double-quoted case-sensitive identifiers. From an Oracle DBA perspective, @MBigglesworth79 calls quoted identifiers in Oracle an Oracle Gotcha and concludes, "My personal recommendation would be against the use of quoted identifiers as they appear to cause more problems and confusion than they are worth."

A key trade-off to be considered when debating quoted case-sensitive identifiers versus default case-insensitive identifiers is one of being able to (but also required to) explicitly specify identifiers' case versus not being able to (but not having to) specify case of characters used in the identifiers.

Choose One or the Other: Don't Mix Them!

It has been my experience that the worst choice one can make when designing database constructs is to mix case-sensitive and case-insensitive identifiers. Mixing these makes it difficult for developers to know when case matters and when it doesn't, yet developers must be aware of the difference in order to use the identifiers appropriately. Mixing identifiers with implicit case and explicit case definitely violates the Principle of Least Surprise and will almost certainly result in a frustrating runtime bug.

Another factor to consider in this discussion is the case folding implemented in the Oracle and PostgreSQL databases. This case folding can cause unintended consequences, especially when porting between two databases with different case folding assumptions. The PostgreSQL database folds to lowercase characters (non-standard) while the Oracle database folds to uppercase characters. The significance of this difference is exemplified in one of the first PostgreSQL Wiki "Oracle Compatibility Tasks": "Quoted identifiers, upper vs. lower case folding." Indeed, while I have found PostgreSQL to be heavily focused on being standards-compliant, this case folding behavior is one place where it is very non-standard and cannot be easily changed.

About the only "safe" strategy for mixing case-sensitive and case-insensitive identifiers in the same database is to know that particular database's default case folding strategy and to name even explicitly quoted identifiers with exactly the case to which the database folds non-quoted identifiers. For example, in PostgreSQL, one could name all quoted identifiers with completely lowercase characters because PostgreSQL folds unquoted identifiers to all lowercase. When using Oracle, the opposite approach is needed: all quoted identifiers should be all uppercase to allow case-sensitive and case-insensitive identifiers to be intermixed. Problems will arise, of course, when one attempts to port from one of these databases to the other because the lowercase or uppercase assumption changes. The better approach for database portability between Oracle and PostgreSQL, then, is to commit either to using quoted case-sensitive identifiers everywhere (they are then explicitly named the same for both databases) or to using default case-insensitive identifiers everywhere (and each database will case fold them in its own way).
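The folding difference can be illustrated with a minimal sketch (assuming a fresh schema; the Oracle statements are shown as comments because the same script cannot run unmodified on both databases):

```sql
-- PostgreSQL folds unquoted identifiers to lowercase:
CREATE TABLE people (id integer);
SELECT id FROM "people";     -- works: unquoted people was folded to people
SELECT "id" FROM people;     -- works: quoted "id" matches the folded column name
-- SELECT "ID" FROM people;  -- would fail: no column named uppercase ID

-- Oracle folds unquoted identifiers to uppercase, so the opposite quoting works:
-- CREATE TABLE people (id number);
-- SELECT "ID" FROM "PEOPLE";  -- works in Oracle; "id"/"people" would fail there
```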


There are advantages both to identifiers with implicit case (case-insensitive) and to identifiers with explicit case (quoted and case-sensitive) in the Oracle and PostgreSQL databases, with room for personal preferences and tastes to influence any decision on which approach to use. Although I prefer (at least at the time of this writing) the implicit (default) case-insensitive approach, I would rather use explicitly spelled-out (double-quoted) identifier cases everywhere than mix the approaches, using explicit case specification for some identifiers and implicit specification for others. Mixing the approaches makes it difficult to know which is being used for each table and column in the database and makes it more difficult to port the SQL between databases, such as PostgreSQL and Oracle, that make different case folding assumptions.

Additional Reading

Tuesday, September 15, 2015

JDK 9: Highlights from The State of the Module System

Mark Reinhold's The State of the Module System (SOMS) was published earlier this month and provides an information-packed, readable "informal overview of enhancements to the Java SE Platform prototyped in Project Jigsaw and proposed as the starting point for JSR 376." In this post, I summarize and highlight some of the concepts and terms I found interesting while reading the document.

  • The State of the Module System states that a subset of the features discussed in the document will be used regularly by Java developers. These features and concepts are "module declarations, modular JAR files, module graphs, module paths, and unnamed modules."
  • A module is a "fundamental new kind of Java program component" that is "a named, self-describing collection of code and data."
  • "A module declares which other modules it requires in order to be compiled and run."
  • "A module declares which ... packages it exports" to other modules.
  • A module declaration is "a new construct of the Java programming language" that provides "a module’s self-description."
    • Convention is to place "source code for a module declaration" in a "file named module-info.java at the root of the module’s source-file hierarchy."
    • This module-info.java file specification of requires and exports is analogous to how OSGi uses the JAR MANIFEST.MF file to specify Import-Package and Export-Package.
  • "Module names, like package names, must not conflict."
  • "A module’s declaration does not include a version string, nor constraints upon the version strings of the modules upon which it depends."
  • "A modular JAR file is like an ordinary JAR file in all possible ways, except that it also includes a module-info.class file in its root directory."
  • "Modular JAR files allow the maintainer of a library to ship a single artifact that will work both as a module, on Java 9 and later, and as a regular JAR file on the class path, on all releases."
  • "The base module defines and exports all of the platform’s core packages," "is named java.base," is "the only module known specifically to the module system," "is always present," is depended upon by all other modules, and depends on no other modules.
  • All "platform modules" begin with the "java." prefix and are likely to include java.sql for database connectivity, java.xml for XML processing, and java.logging for logging.
  • The prefix "jdk." is applied to the names of "modules that are not defined in the Java SE 9 Platform Specification," but are "specific to the JDK."
  • Implied Readability: The keyword public can be added after the requires keyword to state that a module's dependency is also readable by the modules that read it. In other words, if module B declares requires public for module C, then module A, which reads module B, can also read module C.
  • "The loose coupling of program components via service interfaces and service providers" is facilitated in the Java module system by use of the keywords provides ... with ... to indicate when a module provides an implementation of a service and by the use of the keyword uses to indicate when a module uses a provided service.
  • Because a given class is associated with a single module, Class::getModule() will allow access to a class's associated module.
  • "Every class loader has a unique unnamed module" from which types are loaded that are not associated with packages exposed by a module. A given class loader's unnamed module can be retrieved with new method ClassLoader::getUnnamedModule.
    • An unnamed module can read all other modules and can be read by all other modules.
    • Allows existing classpath-based applications to run in Java SE 9 (backwards compatibility).
  • "JMOD" is the "provisional" name for a "new artifact format" that "goes beyond JAR files" for holding "native code, configuration files, and other kinds of data that do not fit naturally ... into JAR files." This is currently implemented as part of the JDK and potentially could be standardized in Java SE at a later point.
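Many of the constructs summarized above come together in a module declaration. The following module-info.java is an illustrative sketch using the SOMS-era syntax; the module and package names are hypothetical, not taken from the document:

```java
// module-info.java at the root of the module's source-file hierarchy
module com.example.inventory {
    requires java.sql;             // this module reads the java.sql platform module
    requires public java.logging;  // implied readability: modules that read this
                                   // module can also read java.logging
    exports com.example.inventory.api;  // only this package is exported to others

    uses com.example.inventory.api.StockProvider;      // consumes a service
    provides com.example.inventory.api.StockProvider   // supplies an implementation
        with com.example.inventory.internal.DatabaseStockProvider;
}
```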

The items summarized above don't include the "Advanced Topics" covered in "The State of the Module System" such as qualified exports, increasing readability, and layers. The original document is also worth reading for its more in-depth explanations, brief code listings, and illustrative graphics.

Project Jigsaw and OSGi

Project Jigsaw, like OSGi, aims for greater modularity in Java-based applications. I look forward to seeing if the built-in modularity support can provide some of the same advantages that OSGi provides while at the same time eliminating or reducing some of the disadvantages associated with OSGi. In the article Mule Drop OSGi For Being Too Complex, Jessica Thornsby has summarized some developers' thoughts regarding the perceived disadvantage of OSGi that have led Spring and Mule, among others, to stop using OSGi. The Thornsby article quotes Dmitry Sklyut, Kirk Knoerschild, and Ian Skerrett, who suggest that better tooling, better documentation (including by the community), better exposure at conferences, and more familiarity through use would help OSGi adoption and help overcome the perceived steep learning curve and complexity.

I will be curious to see if having modularity built into the Java platform will almost automatically bring some of the things that OSGi advocates have argued would increase OSGi's adoption. I suspect that Project Jigsaw, by being built into the platform, will have better tooling support, better exposure to general Java developers, and wider general coverage in the Java developer community (blogs, conferences, books, etc.). With these advantages, I also wonder if Java 9 and Jigsaw will cause current users of OSGi to move away from OSGi, or if those users will find creative ways to use the two together, or will do what they can (such as use of unnamed modules) to use OSGi instead of Jigsaw. Because OSGi works on versions of Java prior to Java 9 and Jigsaw will only work on Java 9 and later, there will probably be no hurry to move OSGi-based applications to Jigsaw until Java 9 adoption heats up. An interesting discussion on current and forthcoming Java modularity approaches is available in Modularity in Java 9: Stacking up with Project Jigsaw, Penrose, and OSGi.

Cited / Related Resources

Saturday, September 12, 2015

JAR Manifest Class-Path is Not for Java Application Launcher Only

I've known almost since I started learning about Java that the Class-Path header field in a Manifest file specifies the relative runtime classpath for executable JARs (JARs with application starting point specified by another manifest header called Main-Class). A colleague recently ran into an issue that surprised me because it proved that a JAR file's Manifest's Class-Path entry also influences the compile-time classpath when the containing JAR is included on the classpath while running javac. This post demonstrates this new-to-me nuance.

The section "Adding Classes to the JAR File's Classpath" of the Deployment Trail of The Java Tutorials states, "You specify classes to include in the Class-Path header field in the manifest file of an applet or application." This same section also states, "By using the Class-Path header in the manifest, you can avoid having to specify a long -classpath flag when invoking Java to run your application." These two sentences essentially summarize how I've always thought of the Class-Path header in a manifest file: as the classpath for the containing JAR being executed via the Java application launcher (java executable).

It turns out that the Class-Path entry in a JAR's manifest affects the Java compiler (javac) just as it impacts the Java application launcher (java). To demonstrate this, I'm going to use a simple interface (PersonIF), a simple class (Person) that implements that interface, and a simple class Main that uses the class that implements the interface. The code listings are shown next for these.

public interface PersonIF
{
   void sayHello();
}

import static java.lang.System.out;

public class Person implements PersonIF
{
   public void sayHello() { out.println("Hello!"); }
}

public class Main
{
   public static void main(final String[] arguments)
   {
      final Person person = new Person();
      person.sayHello();
   }
}
As can be seen from the code listings above, class Main depends upon (uses) class Person and class Person depends upon (implements) PersonIF. I will intentionally place the PersonIF.class file in its own JAR called PersonIF.jar and will store that JAR in a (different) subdirectory. The Person.class file will exist in its own Person.jar JAR file and that JAR file includes a MANIFEST.MF file with a Class-Path header referencing PersonIF.jar in the relative subdirectory.
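The Class-Path header involved here can also be read programmatically with java.util.jar.Manifest. The following sketch builds an equivalent manifest in memory (rather than reading the actual Person.jar) just to show the header and how it is exposed; the class name is mine, not from the original post:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ManifestClassPathDemo
{
   /** Reads the Class-Path value from a manifest like the one in Person.jar. */
   public static String readClassPath()
   {
      final String manifestText =
           "Manifest-Version: 1.0\r\n"
         + "Class-Path: archive/PersonIF.jar\r\n\r\n";
      try
      {
         final Manifest manifest = new Manifest(
            new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
         // Relative entries such as archive/PersonIF.jar are resolved against
         // the location of the JAR that contains the manifest.
         return manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
      }
      catch (final IOException ioEx)
      {
         throw new UncheckedIOException(ioEx);
      }
   }

   public static void main(final String[] arguments)
   {
      System.out.println(readClassPath());
   }
}
```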

I will now attempt to compile the Main.class from Main.java with only the current directory on the classpath. I formerly would have expected compilation to fail when javac would be unable to find PersonIF.jar in a separate subdirectory. However, it doesn't fail!

This seemed surprising to me. Why did this compile when I had not explicitly specified PersonIF.class (or a JAR containing it) as the value of classpath provided via the -cp flag? The answer can be seen by running javac with the -verbose flag.

The output of javac -verbose provides the "search path for source files" and the "search path for class files". The "search path for class files" was the significant one in this case because I had moved the PersonIF.java and Person.java source files to a completely unrelated directory not in those specified search paths. It's interesting to see that the search path for class files (as well as the search path for source files) includes archive/PersonIF.jar even though I did not specify this JAR (or even its directory) in the value of -cp. This demonstrates that the Oracle-provided Java compiler considers the classpath content specified in the Class-Path header of the MANIFEST.MF of any JAR specified on the classpath.

The next screen snapshot demonstrates running the newly compiled Main.class class and having the dependency PersonIF.class picked up from archive/PersonIF.jar without it being specified in the value passed to the Java application launcher's java -cp flag. I expected the runtime behavior to be this way, though admittedly I had never tried it or even thought about doing it with a JAR whose MANIFEST.MF file did not have a Main-Class header (non-executable JAR). The Person.jar manifest file in this example did not specify a Main-Class header and only specified a Class-Path header, but was still able to use this classpath content at runtime when invoked with java.

The final demonstration for this post involves removing the Class-Path header and associated value from the JAR file and trying to compile with javac and the same command-line-specified classpath. In this case, the JAR containing Person.class is called Person2.jar and the following screen snapshot demonstrates that its MANIFEST.MF file does not have a Class-Path header.

The next screen snapshot demonstrates that compilation with javac fails now because, as expected, PersonIF.class is not explicitly specified on the classpath and is no longer made available by reference from the MANIFEST.MF Class-Path header of a JAR that is on the classpath.

We see from the previous screen snapshot that the search paths for source files and for class files no longer include archive/PersonIF.jar. Without that JAR available, javac is unable to find PersonIF.class and reports the error message: "class file for PersonIF not found."

General Observations

  • The Class-Path header in a MANIFEST.MF file has no dependency on the existence of a Main-Class header existing in the same JAR's MANIFEST.MF file.
    • A JAR with a Class-Path manifest header will make those classpath entries available to the Java classloader regardless of whether that JAR is executed with java -jar ... or is simply placed on the classpath of a larger Java application.
    • A JAR with a Class-Path manifest header will make those classpath entries available to the Java compiler (javac) if that JAR is included in the classpath specified for the Java compiler.
  • Because the use of Class-Path in a JAR's manifest file is not limited in scope to JARs whose Main-Class is being executed, class dependencies can be inadvertently satisfied (perhaps even with incorrect versions) by manifest-referenced entries rather than by the explicitly specified classpath entries. Caution is advised when constructing JARs with manifests that specify Class-Path or when using third-party JARs with Class-Path specified in their manifest files.
  • The importance of the JAR's manifest file is sometimes understated, but this topic is a reminder of the usefulness of being aware of what's in a particular JAR's manifest file.
  • This topic is a reminder of the insight that can be gleaned from running javac now and then with the -verbose flag to see what it's up to.
  • Whenever you place a JAR on the classpath of the javac compiler or the java application launcher, you are placing more than just the class definitions within that JAR on the classpath; you're also placing any classes and JARs referenced by that JAR's manifest's Class-Path on the classpath of the compiler or application launcher.


There are many places from which a Java classloader may load classes for building and running Java applications. As this post has demonstrated, the Class-Path header of a JAR's MANIFEST.MF file is another touch point for influencing which classes the classloader will load both at runtime and at compile time. The use of Class-Path does not affect only JARs that are "executable" (have a Main-Class header specified in their manifest file and run with java -jar ...), but can influence the loaded classes for compilation and for any Java application execution in which the JAR with the Class-Path header-containing manifest file lies on the classpath.

Friday, September 11, 2015

Passing Arrays to a PostgreSQL PL/pgSQL Function

It can be handy to pass a collection of strings to a PL/pgSQL stored function via a PostgreSQL array. This is generally a very easy thing to accomplish, but this post demonstrates a couple of nuances to be aware of when passing an array to a PL/pgSQL function from JDBC or psql.

The next code listing is for a contrived PL/pgSQL stored function that will be used in this post. This function accepts an array of text variables, loops over them based on array length, and reports these strings via the PL/pgSQL RAISE statement.

CREATE OR REPLACE FUNCTION printStrings(strings text[]) RETURNS void AS $printStrings$
DECLARE
   number_strings integer := array_length(strings, 1);
   string_index integer := 1;
BEGIN
   WHILE string_index <= number_strings LOOP
      RAISE NOTICE '%', strings[string_index];
      string_index = string_index + 1;
   END LOOP;
END;
$printStrings$ LANGUAGE plpgsql;

The above PL/pgSQL code in file printStrings.sql can be executed in psql with \ir as shown in the next screen snapshot.

The syntax for invoking a PL/pgSQL stored function with an array as an argument is described in the section "Array Value Input" of the PostgreSQL Arrays documentation. This documentation explains that the "general format of an array constant" is '{ val1 delim val2 delim ... }', where delim is the delimiter, a comma (,) in most cases. The same documentation shows an example: '{{1,2,3},{4,5,6},{7,8,9}}'. This example provides three arrays of integral numbers with three integral numbers in each array.

The array literal syntax just shown is straightforward to use with numeric types such as the integers in the example shown. However, for strings, there is a need to escape the quotes around the strings because there are already quotes around the entire array ('{}'). This escaping is accomplished by surrounding each string in the array with two single quotes on each side. For example, to invoke the stored function just shown on the three strings "Inspired", "Actual", and "Events", the following syntax can be used in psql: SELECT printstrings('{''Inspired'', ''Actual'', ''Events''}'); as shown in the next screen snapshot.
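Escaping every embedded quote can get tedious. PostgreSQL's ARRAY constructor syntax (described in the same Arrays documentation) expresses the same call with ordinary string literals; assuming the printstrings function shown above:

```sql
-- Equivalent to the escaped array-literal form, without doubled quotes:
SELECT printstrings(ARRAY['Inspired', 'Actual', 'Events']);
```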

Arrays can be passed to PL/pgSQL functions from Java code as well. This provides an easy approach for passing Java collections to PL/pgSQL functions. The following Java code snippet demonstrates how to call the stored function shown earlier with JDBC. Because this stored function returns void (it's more like a stored procedure), the JDBC code does not need to invoke any of CallableStatement's registerOutParameter() methods.

JDBC Code Invoking Stored Function with Java Array
final CallableStatement callable =
   connection.prepareCall("{ call printstrings ( ? ) }");
final String[] strings = {"Inspired", "Actual", "Events"};
final Array stringsArray = connection.createArrayOf("varchar", strings);
callable.setArray(1, stringsArray);
callable.execute();

Java applications often work more with Java collections than with arrays, but fortunately Collection provides the toArray(T[]) method for easily getting an array representation of a collection. For example, the next code listing is adapted from the previous code listing, but works against an ArrayList rather than an array.
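As a small standalone illustration of that conversion step (independent of JDBC), Collection.toArray(T[]) returns an array whose runtime type matches the supplied array; the class and method names here are mine:

```java
import java.util.ArrayList;
import java.util.List;

public class ToArrayDemo
{
   public static String[] toStringArray(final List<String> strings)
   {
      // The supplied array determines the runtime component type of the
      // result; a new array is allocated if the supplied one is too small.
      return strings.toArray(new String[strings.size()]);
   }

   public static void main(final String[] arguments)
   {
      final List<String> strings = new ArrayList<>();
      strings.add("Inspired");
      strings.add("Actual");
      strings.add("Events");
      System.out.println(String.join(", ", toStringArray(strings)));
   }
}
```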

JDBC Code Invoking Stored Function with Java Collection
final CallableStatement callable =
   connection.prepareCall("{ call printstrings ( ? ) }");
final ArrayList<String> strings = new ArrayList<>();
strings.add("Inspired");
strings.add("Actual");
strings.add("Events");
final Array stringsArray = connection.createArrayOf(
   "varchar", strings.toArray(new String[strings.size()]));
callable.setArray(1, stringsArray);
callable.execute();


Passing an array as a parameter to a PostgreSQL PL/pgSQL stored function is straightforward. This post specifically demonstrated passing an array of strings (including proper escaping) to a PL/pgSQL stored function from psql and passing an array of Strings to a PL/pgSQL stored function from JDBC using java.sql.Array and Connection.createArrayOf(String, Object[]).

Saturday, September 5, 2015

The Latest Twist in the Java IDE Wars: Subscription-based IntelliJ IDEA

I've been watching with some interest the discussion surrounding this past week's announcement that JetBrains is moving to an Adobe-like and Microsoft Office-like software subscription licensing model. The feedback has been fast and furious and JetBrains has responded with a very brief follow-up post We are listening that is entirely reproduced in this quote: "We announced a new subscription licensing model and JetBrains Toolbox yesterday. We want you to rest assured that we are listening. Your comments, questions and concerns are not falling on deaf ears. We will act on this feedback."

There are several places to see the reaction, positive and negative, to this announcement. The feedback comments on both the original announcement post and on the follow-up post are good places to start. There are also Java subreddit threads JetBrains switches to subscription model for tools (101 comments currently) and Are you sticking with IntelliJ IDEA or you are moving on to another IDE? (180 comments currently). I was going to summarize some of the pros and cons of this announcement, but Daniel Yankowsky has done such a good job of this that I'll simply reference his post How JetBrains Lost Years of Customer Loyalty in Just a Few Hours.

After reading these posts, it is clear that the change announced by JetBrains would benefit some consumers but might cost other consumers more and, in some cases, quite a bit more. It all seems to depend on what each individual user actually uses and how he or she uses it. In many ways, this makes me think of my most recent Microsoft Office purchase. I purchased Microsoft Office 2013 for my new PC outright rather than via subscription. It would take 2-3 years of subscription payments to meet or exceed the one-time payment I made for Office, but I anticipate needing few new features in Office in the life of this computer. In fact, I have older versions of Microsoft Office running on older computers and am happy with them. It seems that subscriptions to any software product benefit those who need or strongly desire new functions and features and are more expensive for those who are happy with the current functionality set of a particular product. I personally like that Microsoft allows consumers to choose and either buy the software license outright or use a subscription. Having choice in the matter seems to be what's really best for the consumer.

JetBrains provides numerous tools for a wide variety of programming languages and frameworks, but there are also "substitute products" available for most of these. For example, in the Java world, IDEA competes with freely available open source competitors NetBeans and Eclipse. There is, of course, JetBrains's own freely available Community Edition of IDEA that can be seen as a substitute for the Java developer.

The fact that JetBrains can sell its IDEA IDE when there are good alternative Java IDEs available in NetBeans and Eclipse is evidence that many Java developers like what IntelliJ IDEA has to offer enough to pay for it. I am interested to see how the subscription model will affect this balance. JetBrains's relatively quick follow-up announcement indicates that they are considering at least some tweaks to the announced policy already.

Some of the arguments being made against the announcement seem to be out of principle rather than against the actual cost. The JetBrains Toolbox costs are shown currently to be $19.90/month (USD) for existing users to use all tools, just under $12/month (USD) for existing users of IntelliJ IDEA Ultimate Edition, and just under $8/month (USD) for existing users of WebStorm. Those who are not existing users can expect to pay a bit more per month.

One mistake I think was made in the announcement is typical of consumer-facing companies: presenting a change that has obvious benefits for the vendor as if the change is being made primarily for the benefit of the consumer. Although the announced changes do seem to benefit many consumers, they also fail to benefit a large number of them. There is no question these changes have advantages for JetBrains, and it is difficult to believe that they were driven more by consumers' interests than by the company's. It is not unusual for companies to present changes that benefit themselves as being made primarily in the consumers' interest, but that doesn't mean we like hearing it presented that way. I think this tone can lead people to react negatively to the announcement out of principle.

There are also some interesting articles on software subscription pricing models. Joshua Brustein wrote in Adobe's Controversial Subscription Model Proves Surprisingly Popular that "[Adobe] is making more money selling monthly subscriptions to its Creative Cloud software—the family of programs that includes Photoshop and Illustrator—than it is by selling the software outright." In Software Makers' Subscription Drive, Sam Grobart wrote that "in 2013 consumer software companies proved they could pull off the switch from one-time software purchases to an online subscriber model that costs customers more long term." That article discusses the advantages and disadvantages of consumer software subscriptions from users' and sellers' perspectives.

I'll be watching developments related to JetBrains's announcement in the short-term and long-term to see what effects result from the change in licensing. I'm particularly interested in how this affects IntelliJ IDEA in the Java IDE/text editor space where IntelliJ IDEA Ultimate will continue to compete with IntelliJ IDEA Community Edition, NetBeans, Eclipse, JDeveloper, Sublime, and more.

Monday, August 31, 2015

Book Review: JavaScript: The Good Parts

From the perspective of a book on a programming language that is frequently quoted with reverence by developers who regularly use that programming language, it could be argued that JavaScript: The Good Parts is to JavaScript developers as Effective Java is to Java developers. The subtitle of Douglas Crockford's JavaScript: The Good Parts is "Unearthing the Excellence in JavaScript." I finally purchased and read JavaScript: The Good Parts (O'Reilly/Yahoo! Press, 2008) and this is my review of that book. This is more than a review, however, in that it also provides me a forum to highlight some of the observations from that book that most interested me.

JavaScript: The Good Parts is a relatively short book with ten chapters and five appendices spanning fewer than 150 pages. It's impressive how much content can be squeezed into 150 pages and is a reminder that the best writing (in prose and in code) is often that which can say more in fewer words. I was able to read all of the chapters and the two appendices that interested me the most during a flight that took a little more than an hour (although my reading started as soon as I was seated on the airplane). There is one caveat to this: although some of JavaScript: The Good Parts is a very quick read for anyone with basic familiarity with JavaScript, other portions required me to re-read them or even tell myself, "I better come back to that later and read it again." This is another way in which this book reminds me of Effective Java.


The first page of the Preface is the core content of that section and it provides an overview of what a reader of JavaScript: The Good Parts should expect. The Preface describes JavaScript as "a surprisingly powerful language" which has some "unconventionality" that "presents some challenges," but is also a "small language" that is "easily mastered." One category of developer to which this book is targeted is "programmers who have been working with JavaScript at a novice level and are now ready for a more sophisticated relationship with the language." That sounds like me!

Crockford uses the Preface to describe what JavaScript: The Good Parts covers. He states, "My goal here is to help you learn to think in JavaScript." He also points out that JavaScript: The Good Parts "is not a book for beginners," "is not a reference book," "is not exhaustive about the language and its quirks," "is not a book for dummies," and "is dense."

Chapter 1: Good Parts

In the initial chapter of JavaScript: The Good Parts, Crockford points out that programming languages have "good parts and bad parts" and that "JavaScript is a language with more than its share of bad parts." He points out that these deficiencies are largely due to the short amount of time in which JavaScript was created and articulates, "JavaScript's popularity is almost completely independent of its qualities as a programming language." Crockford has found that a developer can write better programs in any language by only using the good parts of that language as much as possible. This seems to be particularly true with JavaScript. Crockford provides a high-level description of JavaScript's good parts:

"JavaScript has some extraordinarily good parts. In JavaScript, there is a beautiful, elegant, highly expressive language that is buried under a steaming pile of good intentions and blunders."

In the section "Analyzing JavaScript," Crockford surveys the "very good ideas" that JavaScript is built upon along with the "few very bad" ideas that JavaScript is built upon. This first chapter is only 4 pages and the overview of these good and bad ideas is contained in a couple of pages. However, the remaining chapters of the book provide more details on the good parts and the first two appendices provide more details on the bad parts.

I agree with Crockford's assertion that a significant portion of the negativity and even hostility toward JavaScript is probably more appropriately aimed at the DOM. JavaScript has probably been accused of being non-standard and browser-specific millions of times when it is really the browser's DOM implementation that is non-standard and browser-specific.

I cannot finish my review of the first chapter of JavaScript: The Good Parts without quoting one more astute quote: "[Given JavaScript's] many errors and sharp edges, ... 'Why Should I Use JavaScript?' There are two answers. The first is that you don't have a choice. ... JavaScript is the only language found in all browsers. ... The other answer is that, despite its deficiencies, JavaScript is really good."

Chapter 2: Grammar

The second chapter of JavaScript: The Good Parts provides 15 pages of introduction to the "grammar of the good parts of JavaScript, presenting a quick overview of how the language is structured." For someone who has used JavaScript previously, much of this chapter may not be particularly insightful, though just seeing the parts of the language that Crockford feels are "good" is useful. The section on "Statements" points out early some "unconventional" aspects of JavaScript: code blocks delineated by curly braces do not limit scope to those blocks and variables should be defined at the beginning of a function rather than at first use.

Chapter 3: Objects

The 6 pages of Chapter 3 introduce JavaScript objects. Significant aspects of JavaScript objects (key/value pair nature, prototype object association, pass-by-reference, object inspection with typeof and hasOwnProperty, and reducing an object's "global footprint") are covered succinctly.
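
The inspection tools the chapter covers can be sketched in a few lines. This is my own illustrative snippet (the `stooge` example object echoes the book's running examples, but the snippet itself is not the book's listing):

```javascript
// typeof reports a value's type; hasOwnProperty distinguishes an object's
// own keys from keys inherited through its prototype chain.
var stooge = {first: 'Jerome', last: 'Howard'};

typeof stooge;                       // 'object'
stooge.hasOwnProperty('first');      // true: an own property
stooge.hasOwnProperty('toString');   // false: inherited from Object.prototype
```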

Chapter 4: Functions

Chapter 4 of JavaScript: The Good Parts begins with the statements, "The best thing about JavaScript is its implementation of functions. It got almost everything right. But, as you should expect with JavaScript, it didn't get everything right." This chapter is longer (20 pages) than the ones before it, reinforcing that Crockford believes functions are one of the really good parts of JavaScript. Despite its being lengthier than the preceding chapters, Chapter 4 seems to me to also be more dense (particularly than Chapters 2 and 3).

Chapter 4's coverage of JavaScript functions points out one of the differences in JavaScript I needed to come to terms with to feel more confident with the language: "Functions in JavaScript are objects." The section on function invocation briefly describes the four patterns of invocation in JavaScript (method invocation, function invocation, constructor invocation, and apply invocation) and explains how this is initialized differently depending on the particular pattern of invocation used. JavaScript's different meaning of this depending on context has been one of the more difficult aspects of working with JavaScript after coming from a Java background, but this explanation is the clearest and easiest to remember that I have read.
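
A small sketch of how `this` is bound under three of the four invocation patterns may make the distinction concrete. The names here (`counter`, `Point`, `add`) are my own illustrations, not the book's:

```javascript
var counter = {
    count: 0,
    increment: function () {
        this.count += 1;           // method invocation: `this` is `counter`
    }
};
counter.increment();               // counter.count is now 1

function Point(x, y) {             // constructor invocation with `new`:
    this.x = x;                    // `this` is the newly created object
    this.y = y;
}
var p = new Point(3, 4);

function add(a, b) {
    return a + b;
}
// apply invocation: `this` and the argument array are supplied explicitly
var sum = add.apply(null, [3, 4]); // 7
```

(The fourth pattern, plain function invocation, binds `this` to the global object in non-strict code, which Crockford treats as a design error.)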

The fourth chapter covers exception handling, method cascading, and type augmentation. The section on "Augmenting Types" presents multiple examples of adding "significant improvements to the expressiveness of the language" by "augmenting the basic types" via addition of methods to appropriate prototypes.
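
The augmentation idea can be sketched with a helper close to the one the chapter presents: a `method` function on `Function.prototype` that adds a named method to a constructor's prototype only if one is not already defined (the "defensive technique"). This is a reconstruction in the book's spirit rather than a verbatim listing:

```javascript
// Add a method to a constructor's prototype, but only when no method of
// that name already exists (so an existing library is never clobbered).
Function.prototype.method = function (name, func) {
    if (!this.prototype[name]) {
        this.prototype[name] = func;
    }
    return this;
};

// Augment Number with an `integer` method that truncates toward zero.
Number.method('integer', function () {
    return Math[this < 0 ? 'ceil' : 'floor'](this);
});

(-10 / 3).integer(); // -3
```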

The sections on "Recursion," "Closure," and "Module" are where things got a bit dense for me and I needed to read several portions of these sections more than once to more fully appreciate the points being made. I believe I still have a ways to go to understand these concepts completely, but I also believe that understanding them well and implementing the module concept presented here is the key to happiness in large-scale JavaScript development.

The "Curry" section of Chapter 4 states that JavaScript lacks a curry method, but explains how to address that by associating a curry method with Function. The "Memoization" section demonstrates how to use memoization in JavaScript so that "functions can use objects to remember the results of previous operations, making it possible to avoid unnecessary work."
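
A minimal memoization sketch in the spirit of that section: the returned function caches results keyed by its argument, so repeated calls avoid recomputation. The `memoize` name and the `square` example are mine, not necessarily the book's:

```javascript
function memoize(fn) {
    var cache = {};
    return function (n) {
        if (!cache.hasOwnProperty(n)) {
            cache[n] = fn(n);      // compute once, then remember the result
        }
        return cache[n];
    };
}

var calls = 0;
var square = memoize(function (n) {
    calls += 1;
    return n * n;
});

square(4); // 16, computed
square(4); // 16, served from the cache; `calls` is still 1
```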

Chapter 5: Inheritance

JavaScript: The Good Parts's fifth chapter begins by briefly explaining the two "useful services" that inheritance provides in "classical languages (such as Java)": code reuse and type system. It is explained that JavaScript is dynamically typed and therefore gains a single advantage from inheritance: code reuse. Crockford states that "JavaScript provides a much richer set of code reuse patterns" than the "classical pattern."

The "Pseudoclassical" section of Chapter 5 begins with the assertion that "JavaScript is conflicted about its prototypal nature." There is in-depth discussion about the dangers and drawbacks of using the constructor invocation pattern. The most "serious hazard" occurs when a developer forgets to use new when calling the constructor function. Crockford warns that in such cases, this is associated with the global object rather than the (likely) intended new object. The author states that the convention is to capitalize the first letter of constructor function names to flag this risk, but he advises that the better course is to not use new or the constructor invocation pattern at all.
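
The hazard is easy to demonstrate. In this sketch of mine (not the book's listing), calling the constructor function without `new` produces no object at all; in non-strict code, `this` inside the call is bound to the global object, so the assignment silently leaks a global:

```javascript
function Person(name) {
    this.name = name;  // without `new` (non-strict), this writes a global
}

var withNew = new Person('Ada');   // withNew.name === 'Ada'

var withoutNewResult;
try {
    withoutNewResult = Person('Ada');  // non-strict: returns undefined and
                                       // leaks `name` onto the global object;
                                       // strict mode: throws a TypeError
} catch (e) {
    withoutNewResult = e;
}
// Either way, no Person instance was created.
```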

This discussion in the "Pseudoclassical" section of Chapter 5 provides more detail on issues Crockford raised with the "constructor invocation pattern" in Chapter 4. These two sections forced me to acknowledge that while I've liked using the constructor invocation pattern in JavaScript, it's only because "the pseudoclassical form can provide comfort to developers who are unfamiliar with JavaScript." Crockford warns that its use "hides the true nature of the language."

Chapter 5 introduces object specifiers and dives into coverage of JavaScript's prototypal implementation and differential inheritance. The "Functional" section of this fifth chapter illustrates how to use a functional approach to reuse and states that this functional approach "requires less effort than the pseudoclassical pattern and gives us better encapsulation and information hiding and access to super methods." The fifth chapter concludes with discussion and code example of composing objects "out of sets of parts."
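
The functional pattern can be roughed out as a factory that closes over private state and returns an object of privileged methods. The `mammal`/`cat` names echo the book's running example, but this is a simplified reconstruction under that assumption, not the book's exact listing:

```javascript
function mammal(spec) {
    var that = {};
    that.getName = function () {
        return spec.name;          // `spec` stays private inside the closure
    };
    that.says = function () {
        return spec.saying || '';
    };
    return that;
}

function cat(spec) {
    spec.saying = spec.saying || 'meow';
    var that = mammal(spec);       // reuse mammal's behavior
    that.purr = function () {      // then add cat-specific behavior
        return 'rrr';
    };
    return that;
}

var myCat = cat({name: 'Henrietta'});
// myCat.getName() is 'Henrietta', but myCat.name is undefined: the state
// is encapsulated, which the pseudoclassical pattern cannot offer.
```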

Chapter 6: Arrays

The 6-page sixth chapter of JavaScript: The Good Parts introduces the concept of an array and mentions a couple of its benefits, but laments, "Unfortunately, JavaScript does not have anything like this kind of array." The author describes what JavaScript offers as "an object that has some array-like characteristics." He points out that this array-like object is "significantly slower than a real array, but it can be more convenient to use."

Chapter 6 discusses JavaScript's "unconventional" length property for JavaScript "arrays" and introduces syntax for accessing elements, push, and delete. Crockford points out that "JavaScript does not have a good mechanism for distinguishing between arrays and objects" and he provides two brief implementations of is_array functions (the second relies on toString() not being overridden).
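
The second of those is_array approaches can be reconstructed as follows; it relies on Object.prototype.toString, which reports '[object Array]' for true arrays (and, as noted, breaks only if toString is overridden):

```javascript
var is_array = function (value) {
    return Object.prototype.toString.apply(value) === '[object Array]';
};

is_array([1, 2, 3]);        // true
is_array({length: 3});      // false: array-like, but not an array
is_array('abc');            // false
```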

The sixth chapter wraps up with discussion regarding adding methods to JavaScript's Array. Specific code examples include a function that initializes the elements of a JavaScript array and a function that initializes the elements of a matrix (array of arrays).

Chapter 7: Regular Expressions

The nearly 23 pages of JavaScript: The Good Parts's seventh chapter focus on applying regular expressions in JavaScript. For those who have used other implementations of regular expressions (particularly Perl's or implementations based on Perl's), this will be fairly familiar.

Crockford points out several motivations for keeping regular expressions simple, but a JavaScript-specific motivation for simpler regular expressions that he cites has to do with lack of portability between different JavaScript language processors' regular expression support.

Chapter 7 introduces two forms of creating regular expressions in JavaScript: literals (/ syntax) and RegExp constructor. The chapter also introduces other JavaScript syntax for working with various regular expression concepts in JavaScript.
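
The two construction forms look like this (my own illustrative pattern, not an example from the book): a literal reads cleanly and is compiled once, while the RegExp constructor takes a string (with backslashes doubled) and is useful when the pattern must be built at runtime:

```javascript
var literal = /\d{4}-\d{2}-\d{2}/;               // matches an ISO-style date
var built = new RegExp('\\d{4}-\\d{2}-\\d{2}');  // same pattern, escaped twice

literal.test('2015-08-31');  // true
built.test('2015-08-31');    // true
```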

Chapter 8: Methods

The 15+ pages of Chapter 8 of JavaScript: The Good Parts feel like an API reference and remind me of books such as Java in a Nutshell. These pages summarize the "small set of standard methods that are available on the standard types" in JavaScript. The chapter lists the method signature, brief method description, and examples of using that method for standard methods defined on Array, Function, Number, Object, RegExp, and String. Although these are nice summary descriptions and example usages, this chapter may be the least useful chapter of the book given that these APIs are documented online in sites such as the Mozilla Developer Network's JavaScript Reference.

Chapter 9: Style

JavaScript: The Good Parts's most (pleasantly) surprising chapter for me may be Chapter 9. When I was browsing the table of contents and saw "Style," I thought this chapter would be another bland spelling out of what to do and not do stylistically in code. I'm tired of these stylistic discussions. The chapter is fewer than 4 pages, so I did not expect much.

It turns out that the ninth chapter has some important observations in its just over three pages on style. I like that Crockford takes the reasons for style concerns with any programming language and emphasizes that they are particularly important in JavaScript.

My favorite part of Chapter 9 is when Crockford explains his style used in the book for JavaScript code. Some of it is the bland matter-of-taste stuff like number of spaces for indentation, but some of it is motivated by an understanding of JavaScript nuances and limitations. For example, Crockford states, "I always use the K&R style putting the { at the end of a line instead of the front, because it avoids a horrible design blunder in JavaScript's return statement." Similarly, he points out that he declares variables at the beginning of a function and prefers line comments over block comments because of other nuances of JavaScript. He (and my review) covers these more in the appendices.
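
The return-statement blunder Crockford alludes to is automatic semicolon insertion: a semicolon is inserted after a bare `return`, so a brace on the following line silently returns undefined. A quick sketch of my own:

```javascript
function broken() {
    return          // ASI inserts a semicolon right here
    {
        ok: true    // this brace block is never part of the return value
    };
}

function correct() {
    return {        // K&R style: the object literal is actually returned
        ok: true
    };
}

broken();   // undefined
correct();  // {ok: true}
```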

This second-to-last chapter offers some poignant advice regarding coding style and large-scale JavaScript applications:

"Quality was not a motivating concern in the design, implementation, or standardization of JavaScript. That puts a greater burden on the users of the language to resist the language's weaknesses. JavaScript provides support for large programs, but it also provides forms and idioms that work against large programs."

Chapter 10: Beautiful Features

JavaScript's bad features are the focus of Appendix A ("Awful Parts" that "are not easily avoided") and Appendix B ("problematic features" that "are easily avoided"), but Crockford focuses Chapter 10 on what he considers JavaScript's "beautiful features." Because this is the theme of the book, this chapter only needs a bit over 2 pages to highlight Crockford's concept of "Simplified JavaScript": taking the "best of the Good Parts" of JavaScript, removing the features of the language with very little or even negative value, and adding a few new features (such as block scoping, perhaps the thing I miss most in JavaScript).

Appendix A: Awful Parts

Appendix A highlights the "problematic features of JavaScript that are not easily avoided" in just over 7 pages. Crockford warns, "You must be aware of these things and be prepared to cope."

The body of the appendix opens with an assertion that's difficult to argue with: "The worst of all of JavaScript's bad features is its dependence on global variables." I also like that Crockford points out that while many programming languages "have global variables," the problem with JavaScript is that it "requires them."

Appendix A also highlights why JavaScript's handling of reserved words, lack of block scope, 16-bit unicode support, typeof limitations, parseInt without explicit radix, confusion of + for adding or concatenating, "phony" arrays, and a few other features are problematic and how to avoid or reduce their use.
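
Two of those hazards are easy to show in a few lines (my own sketch): parseInt without an explicit radix has historically misread strings with a leading zero as octal, and + performs concatenation whenever either operand is a string:

```javascript
parseInt('08', 10);    // 8, unambiguously: always pass the radix
// parseInt('08') is 8 in modern engines, but older engines assumed octal
// and returned 0, which is why Crockford insists on the explicit radix.

var total = 1 + 2;     // 3: both operands are numbers, so + adds
var oops = '1' + 2;    // '12': a string operand turns + into concatenation
```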

Perhaps the most interesting discussion for me in Appendix A is the explanation of why JavaScript may sometimes insert semicolons and, instead of fixing things, make things worse (masking more significant code issues).

Appendix B: Bad Parts

The six pages of Appendix B "present some of the problematic features of JavaScript that are easily avoided." The appendix details why JavaScript features such as ==, with, continue, falling through switch, statements without blocks, bitwise operators, typed wrappers (and new Object and new Array), and void should be avoided.
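
The case against == comes down to type coercion: it converts operands before comparing, producing surprising and even non-transitive results, while === never coerces. A small demonstration of my own:

```javascript
var loose1 = ('' == 0);     // true: '' is coerced to the number 0
var loose2 = ('0' == 0);    // true: '0' is coerced to the number 0
var loose3 = ('' == '0');   // false: so == is not even transitive
var strict = ('' === 0);    // false: different types, no coercion
```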

Appendix C: JSLint

Appendix C provides 10 pages focused on JSLint, described as "a JavaScript syntax checker and verifier." About JSLint, Crockford states, "JSLint defines a professional subset of JavaScript ... related to the style recommendations from Chapter 9. JavaScript is a sloppy language, but inside it there is an elegant, better language. JSLint helps you to program in that better language and to avoid most of the slop."

Appendix C details how JSLint helps JavaScript developers identify global variables and functions, identify potentially misspelled members (used only once because misspelled but JavaScript itself won't report), identify missing or extraneous semicolons, identify potential issues of automatic semicolon insertion due to improper line breaking, and identify block statements missing opening and closing curly braces. Other items flagged by JSLint include fall-through switch statements, use of with, assignment operator used in a conditional expression, potential JavaScript type coercion with == and !=, eval, void, bitwise operators, potentially non-portable regular expressions, and constructor functions.

The chapter also demonstrates how to specify to JSLint the "subset of JavaScript that is acceptable." In other words, one can choose to not have certain conditions flagged by JSLint. I find it interesting that JSLint provides some HTML validation in addition to checking for well-formed JSON.

I have found that static code analysis tools for Java not only help improve existing Java code, but help me write better Java code in the future as I learn what is considered wrong or bad form, why it is wrong or frowned upon, and how to avoid it. The same is true for JSLint's effect on JavaScript; a person learning JavaScript can benefit from learning what JSLint flags to know the bad/ugly parts of JavaScript to avoid.

Appendix D: Syntax Diagrams

The fourth appendix consists solely of syntax diagrams that graphically indicate how various JavaScript constructs are syntactically constructed. The diagrams are of the portions of JavaScript highlighted in JavaScript: The Good Parts. Appendix D is a reference guide similar to Chapter 8 and, like Chapter 8, is probably the least valuable of the book's appendices because it is information that is readily available online.

Appendix E: JSON

The final ten pages of the book are in Appendix E and are dedicated to JavaScript Object Notation (JSON). The appendix describes JSON as "based on JavaScript's object literal notation, one of JavaScript's best parts." This introduction explains that JSON is a text-based format that is a subset of JavaScript but can also be used as a language-independent data transfer format. Most of the material in this appendix was much newer to readers in 2008, when this book was published, than it is today; many developers who don't know JavaScript well at all are now aware of JSON.

Appendix E describes the syntax rules of JSON in approximately a single page because "JSON's design goals were to be minimal, portable, textual, and a subset of JavaScript."

The section of Appendix E on "Using JSON Securely" looks at the risks of using JavaScript's eval to turn JSON into a useful JavaScript data structure and recommends use of JSON.parse instead. There is also interesting discussion on security implications of assigning an HTML text fragment sent by the server to an HTML element's innerHTML property. What makes this interesting is Crockford's pointing out that this security issue has nothing to do with Ajax, XMLHttpRequest, or JSON, but is rather due to the core JavaScript design flaw of featuring a global object. Crockford takes one more shot at this "feature": "This danger is a direct consequence of JavaScript's global object, which is far and away the worst part of JavaScript's many bad parts. ... These dangers have been in the browser since the inception of JavaScript, and will remain until JavaScript is replaced. Be careful."
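
The recommendation is simple to follow in practice. In this sketch of mine, JSON.parse rejects anything that is not valid JSON, whereas eval would happily execute the same text as code:

```javascript
var text = '{"name": "Ada", "scores": [1, 2, 3]}';
var data = JSON.parse(text);   // a plain data structure; nothing executes

var threw = false;
try {
    JSON.parse('{"x": alert("pwned")}');  // not valid JSON: parse throws,
} catch (e) {                             // where eval would run the code
    threw = true;
}
```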

The last 5 1/2 pages of Appendix E feature a code listing for a JSON parser written in JavaScript.

General Observations

  • JavaScript: The Good Parts deserves the praise and reverence heaped upon it; it is a great book and I cannot think of a JavaScript book that I've read that has done as much for my understanding of this unconventional language as JavaScript: The Good Parts.
  • Many technology books rave about the covered language, framework, or library and either don't acknowledge the deficiencies and downsides of the covered item or quickly explain them away as insignificant or inconsequential. JavaScript: The Good Parts is more effective because it doesn't do this. Instead, Crockford's writing makes it obvious that there are many aspects of JavaScript he likes and finds expressive, but that he also recognizes its downsides. His book is an attempt to teach how to mostly use only good parts of JavaScript and mostly avoid use of the bad parts of JavaScript.
  • Because Crockford takes time to explain JavaScript's unconventional features and distinguish between cases where the unconventional approach is "good" and cases where unconventional approach is "bad," readers of the book have a better opportunity to appreciate JavaScript's positives rather than mostly seeing its negatives.
  • JavaScript: The Good Parts reinforces the idea that trying to treat JavaScript like Java (or any other classically-object-oriented language) is a mistake. It explains why this approach often leads to frustration with JavaScript.
  • JavaScript: The Good Parts is a highly-readable and generally approachable book. The (English) language of the book is clear and well-written. The conciseness is impressive, especially considering that some of the book's most important points are made multiple times in different contexts and the entire book has fewer than 150 main pages.
    • Although JavaScript: The Good Parts is written in a very readable form, some portions of it are more difficult to read because the content is more difficult. This is particularly true in some of the sections in Chapter 4 on functions. Some of these sections required multiple readings for me, but they are also the sections that bring the most insight when understood.
    • One reason that JavaScript: The Good Parts can be as concise as it is has to do with it providing so little introduction. Those who have never coded before or who have no JavaScript coding experience will likely be better off reading a more introductory book or online resource first.
  • Several useful JavaScript code snippets are provided in JavaScript: The Good Parts to illustrate good and bad parts of JavaScript. Along the way, several pieces of code are provided that are generic and reusable and worth highlighting here:
    • Chapter 3 (page 22) provides 6 lines of code for associating a "create method" with the Object function that "creates a new object that uses an old object for its prototype."
    • Chapter 4 (pages 32-33) provides 4 lines of code for making a named method available to all functions. A slightly revised version of this is presented one page later with the addition of a "defensive technique" to ensure that a newly defined method does not override one that another library is already using.
    • Chapter 4 (page 33) provides 3 lines of code for adding an integer to Number.prototype that "extracts just the integer part of a number."
    • Chapter 4 (page 33) provides 3 lines of code for adding trim method to String.prototype that "removes spaces from the ends of a string."
    • Chapter 4 (page 44) provides 8 lines of code for adding a curry method to Function.
    • Chapter 4 (page 45) provides 11 lines of code that implement a generalized function for generating memoized functions.
    • Chapter 5 (page 54) provides 7 lines of code that implement a superior method that "takes a method name and returns a function that invokes that method."
    • Chapter 6 (page 61) provides two brief implementations of is_array functions for determining if a given JavaScript item is an array.
    • Chapter 6 (page 63) provides an implementation of a dim method on arrays that initializes all elements of an array.
    • Chapter 6 (pages 63-64) provides an implementation of a matrix method on Array that initializes all elements of arrays nested within an array.
    • Appendix F (pages 140-145) provides an implementation of a "simple, recursive decent [JSON] parser" to generate a JavaScript data structure from JSON text.
  • A book such as JavaScript: The Good Parts is necessarily opinionated (same applies to the excellent Effective Java). I like it in this case because it's not one-sided, rose-colored glasses opinions, but rather expresses opinions of both JavaScript's good and bad parts. Not all opinions are created equal. In this case, author Douglas Crockford brings great credibility to back his opinions. His involvement with JSLint and JSON alone speak volumes for his experience with and knowledge of JavaScript. Opinionated books written by inexperienced individuals are not likely to be very valuable, but an opinionated book by an experienced developer is often among the most valuable of technical books.


JavaScript: The Good Parts is one of those relatively rare technical books that is very hyped and lives up to that hype. It helps the reader to understand how to use the best parts of JavaScript and avoid or reduce exposure to the bad parts of JavaScript. In the process of doing this, it does help the reader to do exactly what the author is trying to accomplish: to think in JavaScript. JavaScript: The Good Parts condenses significant low-level details and important high-level language design discussion into fewer than 150 pages.