Using the BETA Fluent API to import blog posts from an RSS feed

by Gabe Sumner

As a side project, I’m experimenting with moving Sitefinity Watch to the Sitefinity 4.0 BETA.  As of this writing, there are no migration tools that can be used to migrate blog posts from Sitefinity 3.x to Sitefinity 4.0.  We will eventually deliver these migration tools, but this project gave me an excuse to experiment with using Sitefinity 4.0’s Fluent API to import content.

Ivan Osmak highlighted the Fluent API in our Sitefinity 4.0 BETA webinar.  The Fluent API enables .NET developers to query and manipulate Sitefinity data in very powerful ways. 
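For a quick taste, here is a minimal read-only sketch built from the same App.WorkWith() and Blogs().Get() calls used in the import code later in this post (the exact facade surface may still change during the BETA):

using (var fluent = App.WorkWith())
{
    // Query Sitefinity blogs with ordinary LINQ syntax
    // and collect their titles.
    var blogTitles = (from b in fluent.Blogs().Get()
                      select b.Title).ToList();
}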

Disclaimer:  What I’m going to share in this blog post needs a lot of work.  I’m not yet done with it.  Later I’ll explain the issues I still need to solve.  However, rather than sit on this code for weeks, I wanted to share what I have working.  Later I can iterate.  Alternatively, if someone wants to steal this code and extend it…feel free.  I will gladly link to your blog posts and credit your work.

My Very Basic Goal

As a first step, I simply wanted to point to an RSS feed and import its items into a Sitefinity 4.0 blog.  This obviously isn’t a full-fledged import utility, but it’s a reasonable first step. 

Step 1: Create a Blog

Before importing blog posts, I needed to first create a new blog in Sitefinity 4.0. 

Creating a new blog in Sitefinity 4.0

This was the easy part.

Step 2: Open and prepare the Sitefinity project

A few assembly references need to be added before custom code can be added to a Sitefinity project.  The instructions (and links) below describe how to prepare a Sitefinity project for custom development.

1.  Open the Sitefinity project in Visual Studio

Opening a Sitefinity 4.0 project in Visual Studio

2.  Add the missing assembly references.  <- Important!!

Adding a new reference to a Sitefinity 4.0 web application project

3.  Add an assembly reference to System.Xml.Linq

Adding a reference to System.Xml.Linq

The project is now prepared for custom development.

Step 3:  Create a Blog Import ASPX page

To import blog posts, I needed to run a small bit of code within the context of my Sitefinity web site.  There are various strategies for doing this (for example, I could create a custom UserControl or use Sitefinity Services), but I chose to create a new ~/blogimport.aspx page in my Sitefinity web site.

The instructions below describe how to create this ASPX page:

1.  Right-click the SitefinityWebApp project in the Solution Explorer and click Add > New Item

2.  Select Web Form and type blogimport.aspx for the Name

 Adding a new Web Form item to a Web Application Project

3.  Open the code-behind file for the blogimport.aspx page

The code-behind file for the Blog Import ASPX page.

Step 4:  Create the code

The code for fetching the RSS items and importing these items into Sitefinity (using the Fluent API) is posted below.  Comments have been added throughout the code to explain what is happening.

~/blogimport.aspx.cs

using System;
using System.Linq;
using Telerik.Sitefinity;
using System.Xml.Linq;

namespace SitefinityWebApp
{
    public partial class blogimport : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            // Get the RSS feed from Sitefinity Watch
            var blogRSS = XDocument.Load("http://feeds.feedburner.com/SitefinityWatch");

            // Get the RSS items from the feed
            var rssItems = from d in blogRSS.Descendants("item")
                           select d;

            // If the executing code isn't wrapped in a using, then changes aren't committed
            using (var fluent = App.WorkWith())
            {
                // Get the blog object associated with my Sitefinity Watch blog
                var blog = (from b in fluent.Blogs().Get()
                            where b.Title == "Sitefinity Watch"
                            select b).FirstOrDefault();

                // Print the ID, to confirm I got it
                Response.Write(blog.Id.ToString() + "<br />");

                // Loop through each RSS item
                foreach (var item in rssItems.ToList())
                {
                    // Print the Title for each blog post
                    Response.Write(item.Element("title").Value + "<br />");

                    // Add the blog post from the RSS feed to my Sitefinity 4.0 blog
                    blog.BlogPosts.Add(
                        fluent.BlogPost()
                        .CreateNew()
                        .Do(b =>
                        {
                            b.Title = item.Element("title").Value;
                            b.Content = item.Element("description").Value;
                            b.PublicationDate = DateTime.Now;
                            b.ExpirationDate = DateTime.Now.AddDays(10);
                        })
                        .Get()
                    );
                }
            }
        }
    }
}

Step 5: Import the RSS blog posts

The blog posts found in the RSS feed can be imported by accessing the import page in a web browser.  Because the Sitefinity Fluent API is bound by Sitefinity permissions, you must be authenticated before accessing the page.
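If you want the page to fail fast for anonymous visitors, a small guard at the top of Page_Load works; this sketch uses the standard ASP.NET identity check rather than anything Sitefinity-specific:

protected void Page_Load(object sender, EventArgs e)
{
    // Standard ASP.NET check; without it, an anonymous request would
    // fail deeper inside the permission-bound Fluent API calls.
    if (!User.Identity.IsAuthenticated)
    {
        Response.Write("Log in to Sitefinity before running the import.");
        return;
    }

    // ... the import code from Step 4 goes here ...
}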

1.  Log in to Sitefinity

2.  Access the ~/blogimport.aspx page in a web browser.

Opening the Blog Import ASPX page in a web browser

3.  View the blog in Sitefinity to confirm the blog posts were imported.

Blog post count in Sitefinity 4.0

Some parting words

This is a nice first step, but it is not a fully developed import process.  The technique described above has several limitations. 

For example:

  • Old URLs are not preserved
  • Images associated with the blog posts aren’t imported
  • Only blog posts found in the RSS feed (15 posts) are imported


These are challenges that I still need to work through.  In the meantime, this might be useful to anyone who is experimenting with Sitefinity’s Fluent API. 

Thanks to Ivan Dimitrov and Robert Shoemate for helping me with this code.

13 comments

  1. Michael Russell Sep 08, 2010
    Does the context implement IDisposable?  If so, you could use a using block to ensure proper disposal.

    That said, not committing anything until the context is disposed is a really bad design decision.  There are several corner case instances, especially with web development, where the finalizer isn't guaranteed to run.
  2. Gabe Sumner Sep 08, 2010
    Hey Michael,

    I modified the code above.  Instead of directly calling fluent.Dispose(), I'm now wrapping all the code in a using() {} block.  

    If I omit calling Dispose() and remove the using{} then the changes aren't persisted.  One or the other seems to be required.  

    Other people, smarter than me, will need to explain this behavior.  
  3. Georgi Sep 08, 2010
    Yes, just a little note here - SaveChanges() is invoked *on* disposing, so whether you call .Dispose() or use a using() {} statement, the changes will be persisted. Everything in Sitefinity 4.x gets persisted after calling SaveChanges().
  4. Bob Sep 09, 2010
    Hi guys,

    Let me clarify a bit more on the fluent implementation.
     
    IDisposable is implemented by the FluentSitefinity class, which is the type returned by the App.WorkWith() method. This class serves as a context container for the various facades that you may work with at a time. The class does not have a destructor (finalizer).  The sole purpose of the IDisposable implementation is convenience when the desired result cannot be achieved in a single method chain.
     
    Calling either the SaveChanges method on any facade in the current context (FluentSitefinity instance) or the Dispose method on the context itself will commit the current transaction. Since App.WorkWith() returns a new instance of FluentSitefinity and creates a new transaction, you have to keep a reference to the same instance if you want to commit all changes at once. The “using” statement in C# provides a convenient way to clearly define the scope of the context.

    Currently there is a problem with the “using” statement which you should be aware of. Since using (IDisposable myObject = …) { … } is actually compiled to try { … } finally { myObject.Dispose(); }, an exception thrown inside the using statement won’t prevent the current transaction from being committed. We are going to fix this sometime soon.
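    For illustration, here is a minimal sketch of that pitfall, reusing the facade calls from Gabe's code above:

    using (var fluent = App.WorkWith())
    {
        fluent.BlogPost()
            .CreateNew()
            .Do(p => p.Title = "Half-imported post");

        throw new InvalidOperationException("boom");
        // The exception escapes the using block, but Dispose() still runs
        // in the implicit finally, so the post above is committed anyway.
    }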
  5. Michael Russell Sep 09, 2010
    Bad move, in my opinion.

    The Fluent API seems to be trying to emulate the Entity Framework, which is a good thing because it allows people to use some of their existing skillset in Sitefinity.

    However, straying from established conventions (non-explicit commits, vague method names like "Do()", etc.) is always a bad move.  Bob's last paragraph is a large reason for the convention.
  6. Bob Sep 09, 2010

    Hi Michael,

    I'm very glad you brought up this conversation. I wish the community were much more active in discussing design decisions. I really want to hear more suggestions and opinions.

    Please consider the examples below. NOTE: these are not working examples, as the fluent API is not fully implemented in the beta. It is not even exactly what you are going to see in the next release, as we are currently adding a lot more functionality to the API, such as content lifecycle. Since we are discussing the conceptual aspect of the API, this should not matter.

    App.WorkWith()          // Defines the scope 
        .Blog()             // Switches to Blog facade  
        .CreateNew()        // Creates new Blog and sets it to the current state
        .Do(b => b.Title = "Bob's Blog") // Sets Blog properties
        .BlogPost()         // Switches to Post facade
            .CreateNew()    // Creates new post and sets it to the blog in the current state
            .Do(p =>        // Sets Post properties
            {
                p.Title = "My first Blog post";
                p.Content = "Hello World.";
            })
            .CreateNew()    // The same steps for second post
            .Do(p =>
            {
                p.Title = "My second Blog post";
                p.Content = "Hello again.";
            })
            .SaveChanges()  // Commits the transaction. The blog and the posts are persisted at the same time.
            .Done()         // Returns back to Blog facade 
        .BlogPosts()        // Switches to Posts facade (work with collections)
            .Where(p => p.DateCreated < DateTime.UtcNow) // Some query expression, in this case it will return the two posts we just created.
            .ForEach(p => p.AllowComments = true) // Sets the property for all items in the collection. 
            .SaveChanges(); // Commits the transaction.

    In the example above I demonstrate how you can do multiple things in a single method chain. Also, the transaction commit is very explicit: SaveChanges(). The advantage of this approach is that IntelliSense gives you hints for your next step. Furthermore, the fluent API maintains internal state, so you don't have to manually maintain the relations between the objects, in our example Blog and Posts. It can also protect you from inappropriate or inapplicable actions by offering only the valid options for the particular state. The latter will become very apparent when we start demoing content lifecycle management.

    Just for comparison, I'm adding the example below. It does exactly the same as the previous example, but uses the native API.

    var manager = BlogsManager.GetManager();
    var blog = manager.CreateBlog();
    blog.Title = "Bob's Blog";
      
    var post1 = manager.CreateBlogPost();
    post1.Title = "My first Blog post";
    post1.Content = "Hello World.";
      
    var post2 = manager.CreateBlogPost();
    post2.Title = "My second Blog post";
    post2.Content = "Hello again.";
      
    blog.BlogPosts.Add(post1);
    blog.BlogPosts.Add(post2);
      
    manager.SaveChanges();
      
    var result = manager.GetBlogPosts().Where(p => p.DateCreated < DateTime.UtcNow);
    foreach (var p in result)
        p.AllowComments = true;
      
    manager.SaveChanges();

    The commit is also explicit, but you have to know to use BlogsManager, and you have to add the posts to a blog in order for them to appear in the UI. When we introduce the lifecycle you will have to know a lot more than that: how to handle versions, states, locking, etc., where all of this can be hidden by the fluent API.

    Now let's look at three different implementations of the example that Gabe wrote. Note: since we want to import all RSS items with one commit, we cannot use a single method chain, because we have to keep a reference to the same fluent context in a loop.

    var blogRSS = XDocument.Load("http://feeds.feedburner.com/SitefinityWatch");
    var rssItems = from d in blogRSS.Descendants("item")
                    select d;
      
    BlogPostFacade facade = App.WorkWith()
        .Blogs()
        .Where(b => b.Title == "Sitefinity Watch")
        .First()
        .BlogPost();
      
    foreach (var rssItem in rssItems)
    {
        facade.CreateNew()
            .Do(p =>
            {
                p.Title = rssItem.Element("title").Value;
                p.Content = rssItem.Element("description").Value;
            });
    }
    facade.SaveChanges();

    As you can see, in this example the commit is explicit, but it may be confusing when you work with multiple facades in the same manner, as it is not clear on which facade you should call SaveChanges. Although it doesn't matter, because they will all commit the same transaction.

    The next example just shows a different way to commit a transaction. This is actually what happens behind the scenes.

    var blogRSS = XDocument.Load("http://feeds.feedburner.com/SitefinityWatch");
    var rssItems = from d in blogRSS.Descendants("item")
                    select d;
      
    // Prepares the context with default settings.
    // This method is invoked implicitly when you use directly WorkWith().
    // We need the settings to retrieve information about the current transaction.
    var settings = App.Prepare(); 
    BlogPostFacade facade = settings.WorkWith()
        .Blogs()
        .Where(b => b.Title == "Sitefinity Watch")
        .First()
        .BlogPost();
      
    foreach (var rssItem in rssItems)
    {
        facade.CreateNew()
            .Do(p =>
            {
                p.Title = rssItem.Element("title").Value;
                p.Content = rssItem.Element("description").Value;
            });
    }
    TransactionManager.CommitTransaction(settings.TransactionName);

    And the last one uses a using statement.

    var blogRSS = XDocument.Load("http://feeds.feedburner.com/SitefinityWatch");
    var rssItems = from d in blogRSS.Descendants("item")
                    select d;
      
    using (var fluent = App.WorkWith())
    {
        BlogPostFacade facade = fluent.Blogs()
            .Where(b => b.Title == "Sitefinity Watch")
            .First()
            .BlogPost();
      
        foreach (var rssItem in rssItems)
        {
            facade.CreateNew()
                .Do(p =>
                {
                    p.Title = rssItem.Element("title").Value;
                    p.Content = rssItem.Element("description").Value;
                });
        }
    }

    In the last example the commit is somewhat implicit. I'm not really defending this approach, but I'm curious to hear arguments for why it is bad. How would you suggest handling the Do method?
     
    I encourage all readers to speak out. Changing APIs at later stage is really troublesome.

  7. Michael Russell Sep 10, 2010
    I understand the desire for chaining.  Ask anyone who has used jQuery about how powerful chaining can be.  However with jQuery, it's perfectly acceptable to get to a null state...a state where previous filters, etc., have gotten you down to where you are operating on an empty set of elements.  That isn't the case with Sitefinity.

    That said, I see the following pros and cons here.

    Pro: 99% of a website's time is spent reading data, and it looks like the new Fluent API makes it significantly easier and more type-safe to read data, and from what I understand about the changes to the underlying data structure, it's also going to be more efficient to read the data.  I see nothing but positives here.

    Con: Easier to lose track of the state.  In the current API, I know where I am at each step of the process.  Looking at your first example in the reply gives me a splitting headache even with the comments.

    Con: The database connection is open from beginning to end.

    Regarding the implicit commit on Dispose(), let's talk through some common ramifications.  You are using Fluent API as in your first example, but you've been told that once the object goes away, it will auto-save, so you don't bother with SaveChanges().  Midway through page execution, a Gen 0 collection occurs and your facade and transaction get bumped into Gen 1.  With only 100 pooled SQL connections in ASP.NET to work with, this may not seem bad...until you realize that Gen 1 doesn't get collected as often.  Gets worse when you start thinking about the scope of the transaction: Is it just on the blogs table?  What happens when other pages start wanting to create new blog entries?  What is the transaction scope?  You're going to start hitting deadlocks.  You're going to start running low on connections.  You're going to resource starve the site.  I've had to track down connection leaks on sites where 99% of the work was done in gen 0, but when the connection popped into gen 1, it could take up to 5 minutes before the collection would occur.  It isn't fun.

    It's also not clear when a transaction is being created.  If I'm just trying to get a list of blog posts, and I call App.WorkWith().BlogPosts(), did that create the transaction, or is it only when CreateNew() or Where() is called?  With the Entity Framework or LINQ to SQL, we know when the interaction is taking place...I'm not seeing that here yet.

    The explicit SaveChanges() gives you a good idea of what is going on: open the connection, save, and close.  At least in the current version of Sitefinity, that's also the only time when any validation is done (if only on the database side), so you get an expected time for when a failure can occur and you can structure your error handling accordingly.  Plus we're only dealing with a single potential source for errors.  In your first example, failure can occur on transaction create, on blog creation (title missing/too long/some other error), on blog item creation (title missing/too long/some other error), on save (lost database connection, some validation rule trigger caused an error, out of room in database, etc.), on blog post enumeration, and again on saving all of the records.
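    For instance, with an explicit commit the error handling collapses to one well-defined place (sketched here with the BlogsManager calls from Bob's native API example):

    var manager = BlogsManager.GetManager();
    // ... stage all changes in memory ...
    try
    {
        manager.SaveChanges(); // the single point where persistence can fail
    }
    catch (Exception)
    {
        // Validation, connectivity, and constraint errors all surface here,
        // and nothing has been committed yet.
    }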

    I do agree that the ability to do multiple things within a single transaction is a positive.  It would have made migrating our blogs significantly easier if we could have aborted the transaction and been able to avoid the cleanup tasks on failed migrations.  This just seems like an effort to bring the wrong paradigm into the wrong area to solve the wrong problem.
  8. Lino Tadros Sep 10, 2010
    Great discussion! Thanks Michael and Bob!

    My 2 cents:
    1- I think the biggest issue is: a transaction that you can only commit and not roll back is pointless.
    2- It would be great if the SF team could keep the concept of a transaction within their Business Objects and not even open a DB connection until it's time to commit, at which point everything would be committed at once.

    Cheers
    Lino

  9. Bob Sep 10, 2010

    Michael,

    Now I understand your concerns. There is one thing we should clarify though. When I talk about transactions I don't mean database transaction.

    By default Sitefinity uses optimistic concurrency control, which is maintained by OpenAccess ORM. This means that a connection from the pool is obtained right before executing an SQL statement and it is immediately released after completion.  In this case OpenAccess maintains the transaction and the concurrency control. That's why neither deadlocks nor connection leaks can occur.

    However, your concerns are absolutely valid for pessimistic isolation. If you ever need to use it, you should explicitly specify when a transaction begins and when it ends. Although pessimistic concurrency is rarely used, I acknowledge there is a shortcoming in the current design, which I'm going to fix immediately. I just realized another reason why we should support TransactionBegin: Sitefinity's data providers can be replaced with custom implementations which may not be OpenAccess based and may not support implicit transactions.

    I'd like to touch on another closely related subject: distributed transactions. As you may know, in Sitefinity every module comes with its own set of data providers. Each of these providers may be connected to a different database, or to a data source that is not even an RDBMS. Sitefinity 4.0 supports distributed transactions, and they are automatically handled in the fluent API. Sitefinity detects if there are objects in the current scope that are connected to different sources and automatically puts them in a distributed transaction. The current implementation is done on top of the built-in OpenAccess transaction. It uses two-phase commit, but it is quite limited and has a flaw that usually won't cause a problem; still, I have to find a different solution. The problem is that in some cases the data may become accessible between the phases. My plan is to add support for System.Transactions, but System.Transactions is not supported by all databases. How important is this to your business? Any opinions on System.Transactions?

    Here are some links with more details on the OpenAccess transaction model and distributed transactions:
    http://www.telerik.com/help/openaccess-orm/objectscopes,transactions-and-threads.html
    http://www.telerik.com/help/openaccess-orm/concurrency-control-optimistic.html
    http://www.telerik.com/help/openaccess-orm/system.transactions.html
    http://www.microsoft.com/downloads/en/details.aspx?FamilyId=AAC3D722-444C-4E27-8B2E-C6157ED16B15&displaylang=en

  10. John Waters Sep 10, 2010

    My 2 cents:
    The fluent API is just a programming convenience; you can use it or not, depending on your preference.
    As far as transaction handling goes, there is a SaveChanges method, so the programmer can explicitly commit, so to speak. That part is OK, but changes should be discarded if SaveChanges is not called, not saved automatically on disposal. I think AutoSave is a bad pattern. 

    One thing to point out regarding Dispose not being called until the object is garbage collected: that is true if you create an object that implements IDisposable, use it, then wait for it to be collected.

    But with using() { }, it is equivalent to:

    var x = new ObjectThatImplementsDisposable();
    try { /* body of the using block */ }
    finally { x.Dispose(); }

    So Dispose() does get called as soon as you exit the using statement, not when x is garbage collected.

  11. Bob Sep 10, 2010

    Hey Lino,

    I hope you are doing well.

    All facades have a CancelChanges() method, which rolls back the current transaction. I agree it looks kind of odd to use CancelChanges() in a using block, but apparently the support for using adds more confusion than benefit, so maybe we should remove it.
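    For example, a sketch of an explicit rollback, assuming the Do() call returns the post facade as in the chains above:

    using (var fluent = App.WorkWith())
    {
        var facade = fluent.BlogPost()
            .CreateNew()
            .Do(p => p.Title = "Draft post");

        bool importFailed = false;  // hypothetical failure flag for the sketch

        if (importFailed)
            facade.CancelChanges(); // rolls back the current transaction
        else
            facade.SaveChanges();   // commits it explicitly
    }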

    On the second point, this is exactly how the API currently works.

  12. Lupi Sep 10, 2010
    Hello, have you checked the Fluent nHibernate project?


    I wonder about the dynamic (ex meta) data - after you extend a certain module, say News, with additional properties, will the Fluent API "extend" itself automatically to handle the new changes?

    Lupi
  13. Hardy Erlinger Sep 13, 2010
    Hi,

    Using the fluent API is, and should be, an optional convenience approach for simple, everyday tasks, such as creating a new blog entry or deleting a comment -- tasks that usually involve a single entity or set of entities. The fluent API should be as simple and intuitive as possible, whereas complex actions should be left to the standard API. There I want to have all the options and be able to explicitly control all transactions, their rollback, and their scopes. 

    Making fluent API calls part of a transaction by wrapping them in a using statement is fine as long as I have to call SaveChanges at the end. If I don't, the changes will be rolled back, just as with System.Transactions.TransactionScope. Automagically saving changes at the end of the using block without any explicit programmer interaction is the exact opposite of what I, personally, would expect, and seems quite unintuitive to me. 
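    For comparison, the TransactionScope contract looks like this:

    using (var scope = new System.Transactions.TransactionScope())
    {
        // ... do work against transactional resources ...
        scope.Complete(); // omit this call and everything rolls back
    }                     // Dispose() commits only if Complete() was called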

    Cheers,
    Hardy
