Virtual Tarball - Draft 3

by Jon Davis 3. March 2008 03:16

I finally got around to kicking the tires of my "virtual tarball" idea, which is basically an XML document that consists of HTML-renderable <ul> / <li> tags that describe the contents of an Internet-based directory structure. This allows a single URL to be used to fetch an entire set of files by using a single list of hyperlinks.

I prototyped this on the server side at cachefile.net using a REST-like approach: append the path of a known cached directory at cachefile.net to the following URL:

http://cachefile.net/svc/mrr/ [+ known path from root]

For example:

http://cachefile.net/svc/mrr/scripts/OpenAjax/

This would output the contents of the directory at http://cachefile.net/scripts/OpenAjax/ in XML / <li> format, with hyperlinks.

 

<div class="mrr">
    <label>
        Index of <span class="mrrbase">http://cachefile.net/scripts/OpenAjax/</span>
    </label>
    <ul class="mrrparent">
        <li>
            <a href="../">Parent</a>
        </li>
    </ul>
    <!--
        This is a mrr ("mirror") file, also a.k.a. a "virtual tarball".
        For more information, see
http://www.jondavis.net/blog/?tag=/virtual%20tarball
        -->
    <ul class="mrrdirlist">
        <li class="mrrdir">
            <a href="hub">hub</a>
            <ul class="mrrdirlist">
                <li class="mrrdir">
                    <a href="hub/0.6">0.6</a>
                    <ul class="mrrdirlist">
                        <li class="mrrdir">
                            <a href="hub/0.6/release">release</a>
                            <ul class="mrrdirlist">
                                <li class="mrrfile">
                                    <a href="hub/0.6/release/OpenAjax.js">OpenAjax.js</a>
                                </li>
                            </ul>
                        </li>
                        <li class="mrrdir">
                            <a href="hub/0.6/src">src</a>
                            <ul class="mrrdirlist">
                                <li class="mrrfile">
                                    <a href="hub/0.6/src/OpenAjax.js">OpenAjax.js</a>
                                </li>
                            </ul>
                        </li>
                        <li class="mrrdir">
                            <a href="hub/0.6/testsrc">testsrc</a>
                            <ul class="mrrdirlist">
                                <li class="mrrfile">
                                    <a href="hub/0.6/testsrc/TestSuite.html">TestSuite.html</a>
                                </li>
 
                              <!-- .... -->

                            </ul>
                        </li>
                        <li class="mrrfile">
                            <a href="hub/0.6/build.xml">build.xml</a>
                        </li>
                        <li class="mrrfile">
                            <a href="hub/0.6/index.html">index.html</a>
                        </li>
                    </ul>
                </li>
                <li class="mrrdir">
                    <a href="hub/1.0_build117">1.0_build117</a>
                    <ul class="mrrdirlist">
                        <li class="mrrdir">
                            <a href="hub/1.0_build117/release">release</a>
                            <ul class="mrrdirlist">
                                <li class="mrrfile">
                                    <a href="hub/1.0_build117/release/OpenAjax.js">OpenAjax.js</a>
                                </li>
                            </ul>
                        </li>
                        <li class="mrrdir">
                            <a href="hub/1.0_build117/src">src</a>
                            <ul class="mrrdirlist">
                                <li class="mrrfile">
                                    <a href="hub/1.0_build117/src/OpenAjax.js">OpenAjax.js</a>
                                </li>
                            </ul>
                        </li>
                        <li class="mrrdir">
                            <a href="hub/1.0_build117/testsrc">testsrc</a>
                            <ul class="mrrdirlist">
                                <li class="mrrfile">
                                    <a href="hub/1.0_build117/testsrc/TestSuite.html">TestSuite.html</a>
                                </li>
                                  <!-- ... -->
                            </ul>
                        </li>
                        <li class="mrrfile">
                            <a href="hub/1.0_build117/build.xml">build.xml</a>
                        </li>
                        <li class="mrrfile">
                            <a href="hub/1.0_build117/index.html">index.html</a>
                        </li>
                    </ul>
                </li>
                <li class="mrrfile">
                    <a href="hub/home.href">home.href</a>
                </li>
            </ul>
        </li>
        <li class="mrrfile">
            <a href="home.href">home.href</a>
        </li>
    </ul>
</div>
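A client could consume a listing like the one above roughly as follows. This is only a sketch in Python, not the code behind cachefile.net: it assumes the base URI is the text of the `mrrbase` span and that every `href` is relative to that base, as in the sample. The function name `list_file_urls` and the trimmed `SAMPLE` document are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# A trimmed copy of the listing above, kept short for the example.
SAMPLE = """<div class="mrr">
    <label>
        Index of <span class="mrrbase">http://cachefile.net/scripts/OpenAjax/</span>
    </label>
    <ul class="mrrdirlist">
        <li class="mrrdir">
            <a href="hub">hub</a>
            <ul class="mrrdirlist">
                <li class="mrrfile">
                    <a href="hub/home.href">home.href</a>
                </li>
            </ul>
        </li>
        <li class="mrrfile">
            <a href="home.href">home.href</a>
        </li>
    </ul>
</div>"""


def list_file_urls(mrr_xml):
    """Return absolute URLs for every mrrfile entry, in document order."""
    root = ET.fromstring(mrr_xml)

    # Assumption: the base URI is the text of the span with class "mrrbase".
    base = None
    for span in root.iter("span"):
        if span.get("class") == "mrrbase":
            base = (span.text or "").strip()
            break
    if base is None:
        raise ValueError("no mrrbase span found")

    urls = []
    for li in root.iter("li"):
        if li.get("class") != "mrrfile":
            continue  # directories carry no downloadable content themselves
        link = li.find("a")
        if link is not None and link.get("href"):
            urls.append(urljoin(base, link.get("href")))
    return urls


if __name__ == "__main__":
    for url in list_file_urls(SAMPLE):
        print(url)
```

Because the document is plain XML, the standard library parser is enough here; no HTML parser is needed.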

As an added bonus, you can also get HTML wrapping of the XML file by appending the querystring, "?format=html".

http://cachefile.net/svc/mrr/scripts/OpenAjax/?format=html
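For scripting against the service, the URL scheme described above can be wrapped in a small helper. This is a sketch in Python; the function name `mrr_url` and its `html` flag are hypothetical, and forcing a trailing slash is my assumption based on the example URLs.

```python
from urllib.parse import quote

SERVICE_ROOT = "http://cachefile.net/svc/mrr/"


def mrr_url(path, html=False):
    """Build the listing URL for a cached directory path such as 'scripts/OpenAjax/'."""
    # Drop any leading slash so the path is appended to the service root,
    # and percent-encode everything except the "/" separators.
    url = SERVICE_ROOT + quote(path.lstrip("/"), safe="/")
    # Assumption: directory listings are addressed with a trailing slash.
    if not url.endswith("/"):
        url += "/"
    if html:
        url += "?format=html"
    return url


if __name__ == "__main__":
    print(mrr_url("scripts/OpenAjax/"))
    print(mrr_url("scripts/OpenAjax/", html=True))
```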

What you do with such a tool from here is up to your imagination. I'm opening up the (uncommented) server-side source code for this. The PHP file for my own implementation is here: http://www.jondavis.net/misc/cachefile_mrr_gen.txt

Unfortunately, I have a sinking feeling that this opens up security vulnerabilities. If anyone can spot any, please let me know. I already filter out "..". 
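One common hardening step beyond filtering ".." as a string is to resolve the requested path and confirm that the result still sits inside the published root. The sketch below shows that technique in Python rather than in the PHP of the actual service. The function name `resolve_inside` and the root directory are hypothetical, and this is not a full audit of the script.

```python
import os


def resolve_inside(root, requested):
    """Return the absolute path for `requested` under `root`, or None if it escapes."""
    root_real = os.path.realpath(root)
    # Treat the request as relative even if it starts with a slash or backslash.
    relative = requested.replace("\\", "/").lstrip("/")
    # realpath collapses ".." segments and follows symlinks before the comparison.
    candidate = os.path.realpath(os.path.join(root_real, relative))
    try:
        # commonpath compares whole path components, so a sibling directory
        # whose name merely starts with the root's name does not count as inside.
        inside = os.path.commonpath([root_real, candidate]) == root_real
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        inside = False
    return candidate if inside else None
```

The check runs on the resolved path, so it also catches a symlink inside the cache that points outside it, which a string filter on ".." would not notice.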


The Virtual Tarball - Second Draft

by Jon Davis 22. November 2007 17:50

In prototyping yesterday's blogged idea with .vtb / .mrr files, I've run into some design flaws with the proposed "schema". Chief among them is that directory structures are typically described not as flat lists but as <ul> trees. A local file name should not be described as

<li>dir</li>
<li>dir/subdir</li>
<li>dir/subdir/file.ext</li>

.. but rather as..

<ul class="mrrdir">
  <li>dir
    <ul>
      <li class="mrrdir">subdir
        <ul>
          <li class="mrrfile">file.ext</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

This makes more sense because, when it is rendered in HTML, it is more legible and more maintainable in the DOM.

  • dir
    • subdir
      • file.ext

Imagine if this were "fuzzy" and not strict. If the filename could be "subsubdir/file.ext", or worse, "C:/windows/system32/file.ext", you would run into all sorts of problems trying to target the download destination path. Directory separators are therefore completely disallowed in the text value of the file's <li> element.
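A processor could enforce that rule with a check along these lines. This is a sketch in Python, not the C# prototype. The function name `is_valid_entry_name` is hypothetical, and rejecting ":" (to catch drive letters) and the "." / ".." names goes slightly beyond what the rule above states.

```python
def is_valid_entry_name(name):
    """True if `name` is usable as a single file or directory name in an .mrr entry."""
    name = name.strip()
    if not name or name in (".", ".."):
        return False
    # Reject both separator styles, plus the colon of a drive letter
    # such as "C:/windows/system32/file.ext" (the colon rule is an assumption).
    return not any(ch in name for ch in "/\\:")


if __name__ == "__main__":
    for candidate in ("file.ext", "subsubdir/file.ext", "C:/windows/system32/file.ext"):
        print(candidate, is_valid_entry_name(candidate))
```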

This changes the programming a bit on the Windows app side, in both easier and more difficult ways. It becomes easier to manage the directories, but now the files' download names have to be managed within the directories virtually. Note that by "difficult" I mean a few extra minutes, not a few extra hours; on the other hand, thinking this through, I've already lost a few hours and decided to start over in my code while it's still a brand new and barely written prototype codebase.

Meanwhile, the href value must assume that the base URI is always the base URI for the entire document, not for the listed directory.
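To make the two rules concrete (local paths come from the nesting of the entry names, while every href resolves against the one document-wide base URI), here is a rough sketch in Python. It is not the Windows app: the function name `plan_downloads`, the example.com base URI, and the tiny sample document are all made up for illustration.

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical document: one file nested two directories deep.
SAMPLE = """<ul class="mrr">
 <li class="mrrdir">
  <a href="dir">dir</a>
  <ul class="mrrdir">
   <li class="mrrdir">
    <a href="dir/subdir">subdir</a>
    <ul class="mrrdir">
     <li class="mrrfile">
      <a href="dir/subdir/file.ext">file.ext</a>
     </li>
    </ul>
   </li>
  </ul>
 </li>
</ul>"""


def plan_downloads(mrr_xml, base_uri):
    """Return (absolute_url, relative_local_path) pairs for every file entry."""
    plan = []

    def walk(ul, parent_dir):
        for li in ul.findall("li"):
            link = li.find("a")
            if link is None:
                continue
            name = (link.text or "").strip()
            # Names must be single path segments; the tree supplies the directories.
            if not name or name in (".", "..") or "/" in name or "\\" in name:
                raise ValueError("invalid entry name: %r" % name)
            local_path = posixpath.join(parent_dir, name)
            if li.get("class") == "mrrfile":
                # The href resolves against the document base, not the enclosing directory.
                plan.append((urljoin(base_uri, link.get("href", "")), local_path))
            for child in li.findall("ul"):
                walk(child, local_path)

    walk(ET.fromstring(mrr_xml), "")
    return plan


if __name__ == "__main__":
    for url, path in plan_downloads(SAMPLE, "http://example.com/"):
        print(url, "->", path)
```

Keeping the local path and the URL as separate values means a malformed href can never influence where a file is written.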

Here's a proposed valid sample .mrr doc, where the base URI is: http://cachefile.net/  

<ul class="mrr">
 <li class="mrrdir">
  <a href="scripts">scripts</a>
  <ul class="mrrdir">
   <li class="mrrdir">
    <a href="scripts/jquery">jquery</a>
    <ul class="mrrdir">
     <li class="mrrdir">
      <a href="scripts/jquery/1.2.1">1.2.1</a>
      <ul class="mrrdir">
       <li class="mrrfile">
        <a href="scripts/jquery/1.2.1/jquery-1.2.1.js">jquery-1.2.1.js</a>
       </li>
       <li class="mrrfile">
        <a href="scripts/jquery/1.2.1/jquery-1.2.1.min.js">jquery-1.2.1.min.js</a>
       </li>
       <li class="mrrfile">
        <a href="scripts/jquery/1.2.1/jquery-1.2.1.pack.js">jquery-1.2.1.pack.js</a>
       </li>
      </ul>
     </li>
    </ul>
   </li>
  </ul>
 </li>
</ul>

Rendered in plain HTML:

  • scripts
    • jquery
      • 1.2.1
        • jquery-1.2.1.js
        • jquery-1.2.1.min.js
        • jquery-1.2.1.pack.js

I'll update this post with a revised Windows app (C#) prototype soon.

The Virtual Tarball

by Jon Davis 21. November 2007 16:11

AFAIK, no one has done this, at least not in this specific way. I have a need for it, and I can see it being used everywhere. So I'm proposing it, and I'm going to implement it.

My idea: The virtual tarball. (Or something?) A file extension of something like .vtb, or .mrr (mirror file). Inside, it looks like it's just an XML file with XHTML-renderable hyperlinks, but the file type is used by an executable that pulls the files down into the specified directory with the <a> tags' text as the save-to file name.

Example contents:

<ul class="mrr">
 <li class="mrrfile">
  <a href="http://cachefile.net/file_a.bin">file_a.bin</a>
 </li>
 <li class="mrrfile">
  <a href="http://cachefile.net/file_b.bin">file_b.bin</a>
 </li>
 <li class="mrrdir">dir1</li>
 <li class="mrrfile">
  <a href="http://cachefile.net/dir1/file_c.bin">dir1/file_c.bin</a>
 </li>
 <li class="mrrfile-alternate">
  <a href="http://otherurl.net/dir1/file_c.bin">dir1/file_c.bin</a>
 </li>
</ul>

Given this sample, here's the XML Schema file that Visual Studio generated through automatic conversion:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified"
 elementFormDefault="qualified"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name="ul">
  <xs:complexType>
   <xs:sequence>
    <xs:element maxOccurs="unbounded" name="li">
     <xs:complexType mixed="true">
      <xs:sequence minOccurs="0">
       <xs:element name="a">
        <xs:complexType>
         <xs:simpleContent>
          <xs:extension base="xs:string">
           <xs:attribute name="href"
             type="xs:string" use="required" />
          </xs:extension>
         </xs:simpleContent>
        </xs:complexType>
       </xs:element>
      </xs:sequence>
      <xs:attribute name="class" type="xs:string" use="required" />
     </xs:complexType>
    </xs:element>
   </xs:sequence>
   <xs:attribute name="class" type="xs:string" use="required" />
  </xs:complexType>
 </xs:element>
</xs:schema>

The point of this is that it would look like HTML but could be processed like a .zip file. The only difference between a .mrr file and a .zip file, other than the fact that a .zip file is compressed and isn't human-readable when inspected, is that a .zip contains the contents, whereas a .mrr file only contains hyperlinks to the downloadable files. In the above example, I also have an "-alternate" class so that the processor can treat that entry as a mirrored repository for the same file.
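Here is roughly how a processor might gather the primary and "-alternate" links for each save-to name before downloading. It is a sketch in Python under my own assumptions, not Mrrki's actual logic. The function name `collect_sources` is hypothetical, and the idea that alternates are matched to a file by identical link text is inferred from the sample rather than specified anywhere.

```python
import xml.etree.ElementTree as ET

# Shortened version of the example contents above.
SAMPLE = """<ul class="mrr">
 <li class="mrrfile">
  <a href="http://cachefile.net/file_a.bin">file_a.bin</a>
 </li>
 <li class="mrrdir">dir1</li>
 <li class="mrrfile">
  <a href="http://cachefile.net/dir1/file_c.bin">dir1/file_c.bin</a>
 </li>
 <li class="mrrfile-alternate">
  <a href="http://otherurl.net/dir1/file_c.bin">dir1/file_c.bin</a>
 </li>
</ul>"""


def collect_sources(mrr_xml):
    """Map each save-to name to its candidate URLs, primary first, alternates after."""
    sources = {}
    for li in ET.fromstring(mrr_xml).findall("li"):
        kind = li.get("class")
        link = li.find("a")
        if kind not in ("mrrfile", "mrrfile-alternate") or link is None:
            continue  # e.g. mrrdir entries, which have no link in this draft
        name = (link.text or "").strip()
        url = (link.get("href") or "").strip()
        urls = sources.setdefault(name, [])
        if kind == "mrrfile":
            urls.insert(0, url)  # the primary source is tried first
        else:
            urls.append(url)  # mirrors are fallbacks
    return sources


if __name__ == "__main__":
    for name, urls in collect_sources(SAMPLE).items():
        print(name, urls)
```

A downloader could then try each URL in order and stop at the first one that succeeds.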

Oh, and yeah, the point of the XHTML compatibility is partly for inspection and previewing, but also for JavaScript DOM support. I'm thinking this could be the "engine" for a web browser script library pre-loader page, an idea I have for a new cachefile.net feature.

I'm going to get to work on an open source C# console application for Windows, as well as a JavaScript browser caching implementation.

Update: I've spent most of the night prototyping the C# app. I'm calling it Mrrki ("murky"), and settled on .mrr (for "mirror"). Here's my first rough draft build: http://www.jondavis.net/codeprojects/Mrrki/0.1/Mrrki.zip.


About the author

Jon Davis (aka "stimpy77") has been a programmer, developer, and consultant for web and Windows software solutions professionally since 1997, with experience ranging from OS and hardware support to DHTML programming to IIS/ASP web apps to Java network programming to Visual Basic applications to C# desktop apps.
 
Software in all forms is also his sole hobby, whether playing PC games or tinkering with programming them. "I was playing Defender on the Commodore 64," he reminisces, "when I decided at the age of 12 or so that I want to be a computer programmer when I grow up."

Jon was previously employed as a senior .NET developer at a very well-known Internet services company whom you're more likely than not to have directly done business with. However, this blog and all of jondavis.net have no affiliation with, and are not representative of, his former employer in any way.
