Determine if a String is XML using Java and Regular Expressions

So again I am posting something I have to do every now and then, and each time I have to spend time checking the pattern, usage, etc.

Once in a while, an app that does not do much XML, and therefore is not already using an XML parser of some kind, will need to, at the least, determine if a String is XML. With a pretty simple Regular Expression, it is possible to do this using plain old Java, without any specific XML technology. Keep in mind that this only checks that the String looks like XML (it starts and ends with the same element), not that it is well formed.

I know there are other references out there for doing this, but it is here below as a code sample, for my easy reference and maybe it will help someone else out, who knows. Enjoy.

Are we XML (like) data? :

import java.util.regex.Pattern;
import java.util.regex.Matcher;


public class test {



    /**
     * Returns true if the String passed in is something like XML.
     *
     * @param inXMLStr a string that might be XML
     * @return true if the string looks like XML, false otherwise
     */
    public static boolean isXMLLike(String inXMLStr) {

        boolean retBool = false;
        Pattern pattern;
        Matcher matcher;

        // REGULAR EXPRESSION TO SEE IF IT AT LEAST STARTS AND ENDS
        // WITH THE SAME ELEMENT
        final String XML_PATTERN_STR = "<(\\S+?)(.*?)>(.*?)</\\1>";



        // IF WE HAVE A STRING
        if (inXMLStr != null && inXMLStr.trim().length() > 0) {

            // IF WE EVEN RESEMBLE XML
            if (inXMLStr.trim().startsWith("<")) {

                pattern = Pattern.compile(XML_PATTERN_STR,
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL | Pattern.MULTILINE);

                // RETURN TRUE IF IT HAS PASSED BOTH TESTS; MATCH AGAINST THE
                // TRIMMED STRING SO SURROUNDING WHITESPACE DOES NOT BREAK matches()
                matcher = pattern.matcher(inXMLStr.trim());
                retBool = matcher.matches();
            }
        // ELSE WE ARE FALSE
        }

        return retBool;
    }



}
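
For a quick feel of what this kind of check accepts and rejects, here is a small sketch. The class name XmlLikeDemo is made up for the example, and the pattern is compiled once into a constant instead of on every call. That is my own variation, not a requirement. Note that a lone self-closing element such as <a/> is rejected, since the expression insists on a matching closing tag.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the regular expression comes from the post.
public class XmlLikeDemo {

    // Same expression as in the post, compiled once and reused.
    private static final Pattern XML_LIKE = Pattern.compile(
            "<(\\S+?)(.*?)>(.*?)</\\1>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL | Pattern.MULTILINE);

    static boolean isXMLLike(String in) {
        if (in == null) {
            return false;
        }
        // Trim once so the startsWith test and the regex see the same text.
        String trimmed = in.trim();
        return trimmed.startsWith("<") && XML_LIKE.matcher(trimmed).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "<a>hello</a>",             // same element opens and closes
            "  <a href='x'>hi</a>\n",   // attributes and surrounding whitespace
            "plain text",               // does not start with <
            "<a>hi</b>",                // closing element differs
            "<a/>"                      // self-closing only, no closing tag
        };
        for (String sample : samples) {
            System.out.println(isXMLLike(sample) + " : " + sample.trim());
        }
    }
}
```

If you need more than a "looks like XML" answer, handing the String to a real parser and catching the parse exception is the safer route.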

DOM Document – get or extract contained document (or Node) as XML Source

Something I have to do every once in a while, and can never remember how (especially when under some tight deadline, with people standing over my shoulder asking “is it done yet, is it done?” “how much longer?” etc.) is to extract a fragment of one DOM document to get the XML source of the nested or contained document. So I am going to add a note here, for everyone’s easy reference.

The first step is to get a Node to be the Root Node of the new Document. You can use methods like Document’s getElementsByTagName(String) and Node.getChildNodes(), or the selectSingleNode(Node n, String xPath) method of the XPathAPI or CachedXPathAPI class.

Next we can use a StringWriter and a Transformer to convert the Node to XML Source. Better than a rambling explanation, a simple source example should do the trick. You can use a method something like the nodeToXMLString example below.

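  // Needs: java.io.StringWriter, org.w3c.dom.Node,
  // javax.xml.transform.Transformer, TransformerFactory, TransformerException,
  // javax.xml.transform.dom.DOMSource, javax.xml.transform.stream.StreamResult
  private String nodeToXMLString(Node node) throws TransformerException
  {
    StringWriter sw = new StringWriter();

    // A Transformer with no stylesheet just copies the source to the result
    Transformer serializer = TransformerFactory.newInstance().newTransformer();
    serializer.transform(new DOMSource(node), new StreamResult(sw));

    return sw.toString();
  }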
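
To see both steps together, here is a minimal sketch that parses a small document, grabs a nested element with getElementsByTagName, and serializes just that element. The class name NodeToXmlDemo, the helper extractFirst and the sample XML are invented for the example. I also set OMIT_XML_DECLARATION so the fragment comes back without the XML declaration line, which is my own choice and not part of the method above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; names and sample data are for illustration only.
public class NodeToXmlDemo {

    // Serialize a single Node (and everything below it) to XML source.
    static String nodeToXMLString(Node node) throws Exception {
        StringWriter sw = new StringWriter();
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        // Assumption: we want the bare fragment, without the XML declaration.
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        serializer.transform(new DOMSource(node), new StreamResult(sw));
        return sw.toString();
    }

    // Parse the XML, pick the first element with the given name, serialize it.
    static String extractFirst(String xml, String tagName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Node nested = doc.getElementsByTagName(tagName).item(0);
        if (nested == null) {
            throw new IllegalArgumentException("No element named " + tagName);
        }
        return nodeToXMLString(nested);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><customer id=\"42\"><name>Ann</name></customer>"
                + "<total>9.99</total></order>";
        // Prints only the customer element and its children.
        System.out.println(extractFirst(xml, "customer"));
    }
}
```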

Do a retroactive branch in CVS

You have a source tree that is not branched, and you suddenly need to get a previous version (maybe it is in production at that point, but the trunk has moved on since and is waaaaaaaaaaaay out of sync). What to do? Well, if you could branch from back then, back in time to that revision... well, you get the idea. I am calling this a retroactive branch because instead of the branch being at the current revision or HEAD, it is from some date/revision/point in the past.

This is something that I sometimes find I must do with CVS, but it seems to be a bit of an obscure function, as there are not a lot of posts about it online. Therefore I am going to post about it here.

Here are the steps I use to do a retroactive branch in CVS.

1) Checkout using the tag for the previous revision that you want to start
a branch from.

for example:

    cvs co -r MY_TAG_V01-1 com/example/myapp/build.xml

2) Find out the date according to CVS for the tagged revision
I use “cvs log” to see the date of the tagged revision I wanted

for example:

  cd com/example/myapp/
  cvs log build.xml

In the result I see that my revision for this tag was 1.4 and the date was:
1970/01/01 01:00:00 (this is UTC I assume)

If you're doing a bunch of files, then find the most recent date and use that. Any date will do as long as it is no earlier than the most recent tagged revision among the files you want to retro branch, and earlier than any of their later revisions.

3) Add a branch tag based on the date
an example, for one file:

    cvs tag -D "1970-01-01 01:00:00+00" -b branch_tag build.xml

or for many files (leave off the file name and cvs tags everything under the current directory):

    cvs tag -D "1970-01-01 01:00:00+00" -b branch_tag
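
If you have a pile of dates copied out of "cvs log" and want the -D value without eyeballing them, something like the following sketch could do the picking. It assumes the log dates look like 1970/01/01 01:00:00 and are UTC (as I assumed above), and the class and method names are made up for the example.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of CVS, just date juggling.
public class CvsBranchDate {

    // Date layout as it appears in the cvs log output shown above.
    private static final DateTimeFormatter LOG_FORMAT =
            DateTimeFormatter.ofPattern("uuuu/MM/dd HH:mm:ss");

    // Date layout used for the -D argument in the tag command above.
    private static final DateTimeFormatter TAG_FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    // Return the most recent of the log dates, formatted for "cvs tag -D".
    static String latestAsTagDate(List<String> logDates) {
        LocalDateTime latest = null;
        for (String logDate : logDates) {
            LocalDateTime parsed = LocalDateTime.parse(logDate.trim(), LOG_FORMAT);
            if (latest == null || parsed.isAfter(latest)) {
                latest = parsed;
            }
        }
        if (latest == null) {
            throw new IllegalArgumentException("No dates given");
        }
        // Assumption: the log dates are UTC, so the offset is +00.
        return latest.format(TAG_FORMAT) + "+00";
    }

    public static void main(String[] args) {
        List<String> dates = Arrays.asList(
                "1969/12/31 23:59:59",
                "1970/01/01 01:00:00");
        System.out.println("cvs tag -D \"" + latestAsTagDate(dates) + "\" -b branch_tag");
    }
}
```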

That is it. Voilà. In your favorite cvs app you should see a new revision history for your source tree. I use ViewCVS, and after doing these steps on a file in the source tree, then navigating to that tree and using the "Show files using tag" select control, I can see my "branch_tag" tag. Choosing that tag to view by, I see a whole new revision history for my test file.

Running cvs update -r branch_tag moves your working copy onto the new branch.

Editing the source on the new branch and committing again creates new sub-revisions,

for example:

cvs commit
cvs commit: Examining .
Checking in build.xml;
/var/lib/cvsroot/com/example/myapp/build.xml,v  <--  build.xml
new revision: 1.4.4.1; previous revision: 1.4
done

Apache XMLBeans – output XML without a namespace

This is again something that I need to know how to do but never remember how when time is tight.

Apache XMLBeans http://xmlbeans.apache.org/ is a great tool for working with XML in Java, but it requires the XML Schema being used to create the objects from XML to have a Namespace. The namespace ends up being part of the package structure for the Objects created, and I guess having a unique path for these is a good idea. However, this requirement can be a bit of a pain when working with some simple XML structures that do not have a namespace, especially when you're ready to persist the XML Bean objects to XML source. If a namespace is added to the XML Schema that the XML Bean objects are created from, XML source generated from them will by default also have the namespace. I can never remember how to output the XML source without a namespace, so I am writing it down here where I can get at it with a click, and maybe this will help someone else as well.

There are two key steps to remove the Namespace when outputting XML Source.

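  1. Tell it (an org.apache.xmlbeans.XmlOptions instance) to use the default namespace:
           XmlOptions xmlops = new XmlOptions();
           xmlops.setUseDefaultNamespace();
       
  2. Tell it that you have already declared the default namespace:
           Map<String, String> dnsMap = new HashMap<String, String>();
           dnsMap.put("", "http://example.com/schemas/DefaultNnameSpace");
           xmlops.setSaveImplicitNamespaces(dnsMap);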
      

After this, you can output as normal:

      xmlops.setSavePrettyPrint();
      xmlops.setSaveNamespacesFirst();
      retString = myXMLDoc.xmlText(xmlops);
      return retString;

Note: the research and testing to solve this was done using XMLBeans v2.4.0

Add a KDE like "Open Terminal here" command to Mac Finder

2011 edit

As of Mac OS X 10.7 (Lion), this can now be done as a Service (not exactly like KDE, but better than nothing). Go to:

System Preferences > Keyboard > Keyboard Shortcuts > Services

Then check the box to enable “New Terminal at Folder”

Then when you are in the Finder and want to open a terminal right there, right click on the folder name and at the bottom of the menu you will see the “New Terminal at Folder” item. You can also drag the folder and drop it onto the Terminal icon.

————– 2009 post ————

Before OSX came along, making the Mac usable, I was always mainly a Linux user and became very used to the KDE and Gnome UI and features. When getting into the OSX, I really missed certain things and just had to figure out how to add em, to be comfortable. This is one big one. Though I usually use a Shell and Midnight Commander to get around in my Linux systems, from time to time I would end up in the KDE GUI File browser thing, and end up using their cool “Open a terminal here” command, to get to the Shell and do stuff. Since we are mostly in a GUI environment when using OSX, this feature was killing me. But, there is a simple way to add this thanks to the Automator and Apple Script. Here is what I found (though I do not remember where) when looking at how to do this.

  • Launch Automator
  • Create a new Workflow (or a new Service if using Snow Leopard)
  • (If in Snow Leopard, at the top of the work area choose “Folders” from the first select control, and “Finder” from the second)
  • Choose Finder, then drag “Get Selected Finder Items” to the work area (under “Files and Folders” in Snow Leopard, or use the search feature to search for “Get Selected Finder Items”)
  • Choose Applications, then drag “Run Apple Script” to the work area below the Finder action you just added (in Utilities in Snow Leopard, or you can use the search feature to search for “Run Apple Script”)
  • Replace the default Apple Script it generates for you, with the following
    on run {input, parameters}
    
    	tell application "Finder"
    		set winOne to window 1
    		set winOnePath to (quoted form of POSIX path of (target of winOne as alias))
    		tell application "Terminal"
    			activate
    			tell window 1
    				do script "cd " & winOnePath
    			end tell
    		end tell
    	end tell
    
    	return input
    end run
  • Then from the Automator menu, choose “File”, then “Save As”. If not Snow Leopard then you have a workflow and need to save it to the place the OS can pick it up, so if not already defaulting to this location, navigate to:
    [user home]/Library/Workflows/Applications/Finder

    and name the file something that you want the Menu to show when you go to use this new Automator command (I used “term-here” for instance)

    But if you are in Snow Leopard, then you have a service and “Save-As” only prompts you for a name. I entered “term-here”.

Once these steps are completed and the file saved, you should be able to open the Finder, navigate to a directory you wish to open a Terminal in, right click in the directory, choose “Automator” and see your new command there. Choosing the command should pop up a new Terminal with the working directory set to the directory you were in.

NOTE, of course if you are on Snow Leopard it is different. Here you need to choose the dir you want with the pointer and right click. To get the current directory, I choose to view the path bar from the Finder’s View menu, and then I can choose the dir I am in and click. I also created a keyboard shortcut for it. To create a keyboard shortcut, when in the Finder, choose Services from the Finder menu, then Services Preferences, then scroll down to find your new service. Click into the white space to the right of your service to get the entry box for your shortcut, and then use the keys you want to be the shortcut, as if you were trying to launch it right now, and it will store them in the box for you.

Enjoy

j

Great site for searching for News Group posts (e.g. looking for tech solutions in groups)

While searching for support for an open source library I am using, I came across this neat site called Mark Mail

http://markmail.org/

It is a kind of search engine for newsgroups, forums, etc. Their own “about” page describes the site in short as:

MarkMail is a free service for searching mailing list archives

Anyway, it has a nice way of showing a complete thread for a topic, which is the main reason I like it. Give it a shot next time you're looking for support threads.

j

set up a Debian Linux machine to handle UTF-8 in a shell or console app

To set up a Debian Linux machine to handle UTF-8 in a shell or console app do the following.

First, use dselect or whatever tool you like to find the Japanese font packages for X and install em.

Then run

  dpkg-reconfigure locales

then choose en_US.UTF-8

Test by executing the following in a shell:

  locale charmap

it should say

UTF-8

If not try just

  locale

it should have UTF-8 for everything like:

LANG=en_US.UTF-8
LANGUAGE=en_US:en_GB:en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8

If not, then add the following to .bashrc and re-source it (e.g. get a new login shell, or execute bash)

  export LC_ALL="en_US.UTF-8"

then try

locale charmap

or

 locale

Once that is all set to UTF-8, change your terminals (xterm, rxvt) to use:
uxterm
urxvt

That is it. I had to exit X11 and re-login to get X11 to take these settings
so that clicking my icon for xterm launched uxterm WITH the correct environment

After this all console apps that can handle UTF-8 (like vim) display UTF-8
characters correctly.

How to kill Dashboard (Macintosh OSX)

I do not use the Dashboard on my Macbook, however from time to time I start it accidentally and then have it running and using resources. Here is one way to kill it. I constantly forget how to do this and then have to look it up. As with my last post, now it is here and hopefully easier to find for me (and maybe others).

To kill the Dashboard, open a Terminal and then run the following commands:

   defaults write com.apple.dashboard mcx-disabled -boolean YES;killall Dock

NOTE: the second command on that line (killall Dock) will restart the Dock, so don't be alarmed if it disappears for a moment. It will come right back.

j

Declare and fill a multi dimensional array in Java

This is one of those super simple things I won't use for ages, and then when it comes time to use it, I can never remember and have to look at the API again. It makes me angry that I don’t remember it, so it is here now and I (and others maybe) can remember with a click.

Using a Java String array as an example, the declaration looks like:

 

String[][] myStr = {
{"col-0-row-0","col-1-row-0"},
{"col-0-row-1","col-1-row-1"}
};
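
When the values are not known up front, the same kind of array can be declared with a size and filled in a loop. A minimal sketch follows (the class and method names are just for the example). The first index picks the row, the second picks the column.

```java
import java.util.Arrays;

// Hypothetical demo class showing a sized declaration plus a fill loop.
public class GridDemo {

    // Build a rows x cols String array where each cell names its own position.
    static String[][] buildGrid(int rows, int cols) {
        String[][] grid = new String[rows][cols];
        for (int row = 0; row < rows; row++) {
            for (int col = 0; col < cols; col++) {
                grid[row][col] = "col-" + col + "-row-" + row;
            }
        }
        return grid;
    }

    public static void main(String[] args) {
        String[][] myStr = buildGrid(2, 2);
        // prints [[col-0-row-0, col-1-row-0], [col-0-row-1, col-1-row-1]]
        System.out.println(Arrays.deepToString(myStr));
        // A single cell: row 1, column 0
        System.out.println(myStr[1][0]);
    }
}
```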

Well let's start then

Hello world. Well there is a start. Everything must start with that right? 

 

Okay, the purpose of this post is to tell you all why I have created this new blog. I have previously mingled my Family and hobbies (Art, Music, Bicycles) with tech talk and have decided to separate the two. As you can see from the title of this blog, this is not going to be the Hobby site. 

 

I hope you stop by from time to time to see what I have to contribute today, tomorrow and days++

 

j
