Update on my stuff - June 6, 2008

by Jon Davis 7. June 2008 14:29

OK it's been a while since I did any real blogging lately, here's a sum-up:

  • I've still forgotten how to blog. While this is my blog and I can post whatever I want, this blog has become a high-level geek toy discovery dump rather than a software development blog. While I make no apologies for this--I can do whatever I want here--I'm trying to figure out if I should fork my blogs, one for serious software development discoveries and one for shallow geek discoveries. I've already been wishing I could add a personal blog somewhere, not here, and post thoughts and feelings about non-tech stuff. (This is something I used to do but it got carried away so I boycotted it and focused on purely geek speak here. I miss doing personal blogging though, and I feel like it would be good for me to do it again, if I'm just a bit more careful.)
     
  • I keep mentioning the LION Search Service, how we intend to open source it, and how it's been cancelled, and then I deleted that cancellation post, etc. It keeps coming back and going away and coming back again. Well part of that is just the reality of how it's being treated at the office; open sourcing LION required all-or-nothing support on the part of my teammates at the office, and I was getting really pointless "I'm building my own search service instead of yours but I'm gleaning ideas from yours" feedback from others on the team. Lately, though, they've discovered that what they were doing was accomplishing the same tasks as what I had already implemented, and in the absence of time availability they reverted and have adopted LION. So LION is back, the team has adopted it, and we're about to go live with our first faceted-searching web site using LION (a couple other web sites were already using it but without the new faceted searching). It's now a lot more official, LION is our exciting new search platform going forward.

    So,
    • LION is a nearly-sort-of-enterprise-class search engine, written in C#, currently in development, built on Lucene.NET, and its continued development is being inspired in part by the Apache Solr project (but was conceived and put into production before Solr was discovered).
      • It currently offers
        • rich, paginated queries with field and index selection,
        • Lucene.NET break-neck speeds,
        • faceted searching,
        • strong document modeling, and
        • dynamic updates (currently limited to index A/B switching but that will be improved soon).
      • It currently runs on WCF (.NET 3.0). Not BizTalk or anything otherwise weird or expensive to support. It currently does not have, but will soon have, REST and AJAX support.
      • It does not have and might not ever have the high availability "server farm" feature set that I believe Solr offers. Service stability and uptime has been a constant thorn in our sides, so rethinking and possibly rewriting portions of the service itself is going to be necessary before we put it out there for the world to consume in open source.
      • It is currently not, but soon will be, built around IIS 7. Right now it's just a standalone console application, wrapped in a Windows service.
    • We will still be open-sourcing LION, eventually. Right now it's looking like a late-summer or fall time frame.
    • LION will be "internally sourced" first, to be shared to other departments in our large and disperate company, to be evaluated and discovered first. (I think this is very wise, just something I didn't think about until the boss said to do it that way.)
    • LION has been made a team-shared ownership; since I built it on the job, there's little I can do to stop the boss from handing it to a certain other team member and telling him to own it, which would be totally unsurprising. While I wouldn't agree with the situation (I've only put my heart and soul into LION, and I'm also the oldest and most experienced person on the team, even if I have a couple areas of technical weaknesses), a situation like that is not something I can change without throwing the baby out with the bathwater, and I'll live.
        
  • I've been thinking about engaging in one or more side projects I would wholly and independently control. Given the big ideas I have, I'm still not sure I want to do one, though, as it's all a lot of work, some of it very tempting to try to bring into the workplace, and I don't want my authortitative work to be disrespected in the workplace like LION has been. Even so, the ideas are still on the table for consideration
      
    • BlogObjects - I registered BlogObjects.net back when I registered PowerBlog.net (I wrote in VB6 and rewrote in C# a commercial blogging desktop client application about five years or so ago, with zero commercial success but significant personal/professional technical growth for myself) and was considering making PowerBlog's development API a set of open-source blogging tools. I'm thoroughly persuaded that this will never be commercially viable as blogging APIs are a dime a dozen, but, like PowerBlog was five years ago, starting from scratch on a client/server blogging API would be a good refresher on technology in general. This is something that, if I do it, it will be not only open source, it will also be cross-platfom to Mac and to Linux, and will also be complemented with a PowerBlog resurrection or else PowerBlog-like alternative with fewer feature (no Active Scripting [VBScript, Javascript], no team synchronization support).
        
    • CMSObjects - Taking BlogObjects to another level, CMSObjects would build upon BlogObjects's idea (but not the codebase) and add:
      • An enterprise class CMS service
      • Rich support for the Atom Publishing standard API
      • Content typing (for example, a "Story" article type, a "Video browser", a "Photo gallery", a "Product Detail" page, etc), with scripted or compiled coded definitions that support inheritence
      • Rich templating both per content type and per content instance
      • Workflows
        • Team member security (user can post, user can only read, user can only post to a particular category or to a particular URL prefix, etc.)
        • Team members' posts can be flagged for approval, and then approved by a senior editor
        • Content can be annotated, seperately from the publshed output -- ideally, the annotations can be made inline, directly inside of the content
        • Content publishing to the web can be postponed to a digital publish date
      • Content versioning 
      • Again, completely open-sourced, but more web service oriented, possibly web-driven, and much less GUI-driven (so no PowerBlog for CMSObjects except for basic blogging to a CMSObjects service)
          
    • CRMObjects - Yeah the "Objects" suffix is getting a little silly, but my short experience with Sage Software's SalesLogix exposed me to another facet of software and business technology that reflects the genuine needs of businesses in general, which is using technology to support sales and support staff. CRM can be painfully messy and awkward, but tossing together some basic building blocks might be a fun and rewarding experience, even if it is not commercially viable (way too much competition, as with blogging). Who knows, though, businesses might actually use it, and supporting it could actually become lucrative.
        
    • ERPObjects - I'll do that by myself, in my spare time. Just kidding.
        
  • I have a book here on Cocoa programming. I'm curious about Objective-C, how it's a weakly typed, object-oriented C language superset, which just sounds weird (C being object oriented and weakly typed?! .. weird! .. but cool!!), I can see why the book comes right out and says that it can be extremely dangerous, I saw enough of this with VB6 programming's suport for weak typing and evil Variants in my past, but keeping in line with C linguistics and C power, I'm very curious about it. I'm guessing that Microsoft's answer to Objective-C is C++/CLI, but the CLI in itself is still strongy typed, so you have to look at integration points with JScript.NET or something (but then there's Lua, Python, etc., which the non-Microsoft community also has). I watched an introduction to Objective-C video amongst the iPhone SDK videos recently and, while it does look different, I can see how it as been used to give Mac and iPhone developers a lot of shortcuts into fast and efficient software development.
      
  • Over the last two or three weeks I've become less and less excited about cross-platform development libraries and APIs, such as Mono, Java, wxWidgets, Qt4, and SDL, for a few reasons:
    • While cross-platform GUI APIs pretty actually do work across platforms like they promise, their end output is a bit less predictable and/or desirable than one might expect.
      • Choosing a GUI API that uses its own UI rendering results in software that can be predicted by the developer, but it is not predictable by the user in the context of other software. Java Swing apps traditionally demonstrated this a long while back with is blueish, proprietary look and feel, and that's not what users want. Users want software that looks and feel like the software already installed on their systems. Sometimes these cross-platform APIs, like Java, come bundled with visual schemes that simulate these operating systems, but they are clearly faked and do not take advantage of the rendering APIs already offered by the core operating system.
      • Choosing a GUI API that uses the operating system's rendering and layouts (such as how the Mac uses the Aqua scheme but also puts the main menu up on the top of the screen) does not always result in predictable rendering output. Sometimes, for example, the borders or clickable "handles" of a drop-down list sticks out further to the right, above, or below the dimensions specified by the developer, or else the text rendering inside becomes unreadable because the internal padding or inner/inset "border" that contains the content is too small on one operating system while on the developer's operating system it appears fine. Sometimes there is somehow a mismatch of pixels to DPI, and the GUI code is using one unit of measure while the OS has adopted another, and so some "defensive layout programming" needed to have been written but wasn't because the developer's operating system didn't have this or that feature. And so on.
    • Java and C# are both still too slow for my liking, and other non-native languages are not mainstream enough to feel confident in building around with the knowledge that others can help carry the torch.
    • I've found that some cross-platform software applications that use generically cross-platform dependencies such as SDL are actually quite unstable. This doesn't mean that SDL et al are unstable, it just means that it is too easy for developers to build upon such a framework and not hash out platform-specific causes to application failures. There is no such thing as a silver bullet, SDL/wx/Qt/etc notwithstanding.
    • Ultimately, C and C++ (but not Objective-C) are themselves cross-platform languages, with platform-specific dependencies (and, of course, CPU-specific machine code compilation). The best thing to do, I think, is to get a handle on building C++ applications for one platform, but build things out generically enough so that platform-specific dependencis are broken off into libraries. Then, when porting the applications, you only need to port the platform libraries. This is what most cross-platform software I've seen ends up doing when they also take advantage of the features of the operating system like DirectX.
    • Another approach might be to build upon a cross-platform API first, or perhaps even a cross-platform non-native language like C#, and then port each component, piece by piece, to C++ on behalf of the native patform.
    • None of this is to say that I think wx, Qt, SDL, et al, are worthless or that I won't use them on a regular basis (if indeed I can get myself coding C/C++ apps to begin with), I'm just trying to say that there is no silver bullet and use of these libraries, rather than native Cocoa (for Mac) or MFC (et al, for Windows), would be a slight compromise for time and resources, not a magical, perfect answer for excellent software that "just works".
        
  • At the end of the day, I'm still scratching the surface and barely finding time enough to sleep, much less write the software I'm envisioning. Ugh.

Currently rated 1.3 by 3 people

  • Currently 1.333333/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Tags: , , , , , , ,

Pet Projects | Software Development

Faceted Searching with Lucene.net and LION

by Jon Davis 6. May 2008 21:06

Two or three months ago I noted that I was planning on adding a nifty new QueryBuilder object as well as add new faceted searching support.

http://www.jondavis.net/blog/post/2008/02/LION-Searching%2c-AJAX-Coding-
With-jQuery-and-ASMX%2c-Jaxer-Tinkering%2c-and-COD4-Gaming.aspx

Um, well, we got sidetracked. But the need continued to haunt us, and a co-worker stumbled upon this:

http://markmail.org/message/hyetqvex7bqnshek#query:facet%20search%20
lucene.net+page:1+mid:zrew4dimoktd6vex+state:results

.. which included a sample source file (SimpleFacets.cs) that proved to be a workable solution for us. (Wow. That was easy.)

I have reached the next major milestone for our project for

  • my pretty QueryBuilder object
  • faceted searching on a particular set of fields (my query can optionally specify specific values for facet counting rather than rely on the top scrape)
  • faceted searching on a named subquery (likely very slow [untested] but very powerful)

.. and that milestone is writing the code and making sure that the old application code still runs! I haven't even tested it yet. No, I didn't do TDD, shame, shame, shame!! Problem is, there's just so much code already that is not already bound to TDD unit tests that to refactor for NUnit or VSTS I'll have to really dedicate some time to make it worthwhile to form up a pattern .. and time is something I'm short on here. Then again, I don't have enough time NOT to write tests before writing code, so I'm feeling pretty lame right now.

That said, I did ask my boss a couple months ago if it was okay with him if I could open source LION (Lucene Indexing On .NET), he doesn't have a problem with it, and I'd like more hands in the pot. Everyone at the office is just too dang busy, and when they see it all they can say is, "Wow, that's a lot of code, I thought it was just Lucene.net API calls, you have WCF providers and query wrappers and document structure base classes and everything..." They don't exactly say that with appreciation; they want small chunks of throw-away code scattered everywhere. I have tried to make LION support that, too, but I don't write my implementation code that way so that's not what they see when I show it to them.

Anyway, now that faceting is partially implemented (fully written but untested, bleah) after I write some tests up and debug and everything works as it should I'm going to fork off the core projects and refactor and rename things and then open-source the whole thing. I'm trying to go in the direction of Lucene Solr on this in terms of front-end feature set, but there's a lot of re-thinking and refactoring to do on the service side to ever call it "enterprise class", our needs have been mostly met but others might be able to contribute to the project to help it meet their own custom needs (and hopefully with minimal tweaks).

Currently rated 1.2 by 5 people

  • Currently 1.2/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5

Tags: , , , ,


 

Powered by BlogEngine.NET 1.4.5.0
Theme by Mads Kristensen

About the author

Jon Davis (aka "stimpy77") has been a programmer, developer, and consultant for web and Windows software solutions professionally since 1997, with experience ranging from OS and hardware support to DHTML programming to IIS/ASP web apps to Java network programming to Visual Basic applications to C# desktop apps.
 
Software in all forms is also his sole hobby, whether playing PC games or tinkering with programming them. "I was playing Defender on the Commodore 64," he reminisces, "when I decided at the age of 12 or so that I want to be a computer programmer when I grow up."

Jon was previously employed as a senior .NET developer at a very well-known Internet services company whom you're more likely than not to have directly done business with. However, this blog and all of jondavis.net have no affiliation with, and are not representative of, his former employer in any way.

Contact Me 


Tag cloud

Calendar

<<  September 2018  >>
MoTuWeThFrSaSu
272829303112
3456789
10111213141516
17181920212223
24252627282930
1234567

View posts in large calendar