
Data virtualization in WPF

by admin

Good afternoon.
I have long wanted to write my own class for loading data from a database efficiently, for example when a table holds more than 10 million records: deferred loading, multiple data sources, and so on.
I couldn’t find a post on Habr about this topic, so I present my translation of Paul McClean’s article, which was my starting point in solving this problem.
Original article : here
Source project files : here
From here on in the text I will write on behalf of the author.


WPF provides some interesting UI virtualization capabilities to handle large collections efficiently, at least from a UI perspective, but provides no general method for data virtualization. While many posts on the forums discuss data virtualization, no one (to my knowledge) has published a solution. This article presents one such solution.


User interface virtualization

When a WPF ItemsControl is bound to a large source collection with UI virtualization enabled, the control creates visual containers only for the visible items (plus a few above and below). This is usually a small portion of the source collection. As the user scrolls through the list, new visual containers are created when items become visible, and old containers are disposed of when items scroll out of view. When container recycling is enabled, containers are reused instead, which reduces the overhead of creating and destroying objects.
UI virtualization means that a control can be associated with a large collection of data and still take up less memory due to the small number of visible containers.

Data virtualization

Data virtualization is a term that means achieving virtualization for the data object associated with the ItemsControl. Data virtualization is not provided in WPF. For relatively small collections of underlying objects, memory consumption does not matter. However, for large collections, memory consumption can become very significant. In addition, retrieving information from the database or creating objects can take a long time, especially with network operations. For these reasons, it is desirable to use some kind of data virtualization mechanism to limit the number of data objects that must be retrieved from the source and placed in memory.



This solution relies on the fact that when an ItemsControl is bound to an IList implementation rather than an IEnumerable, it does not enumerate the entire list but accesses only the items needed for display. It uses the Count property to determine the size of the collection, presumably to set the scrollbar extents. After that, it retrieves the on-screen items through the list indexer. This makes it possible to create an IList that reports a large number of items but fetches them only as needed.


In order to use this solution, the underlying source must be able to provide information about the number of items in the collection, and provide a small portion (or page) of the entire collection. These requirements are expressed in the IItemsProvider interface.

/// <summary>
/// Represents a provider of collection details.
/// </summary>
/// <typeparam name="T">The type of items in the collection.</typeparam>
public interface IItemsProvider<T>
{
    /// <summary>
    /// Fetches the total number of items available.
    /// </summary>
    /// <returns></returns>
    int FetchCount();

    /// <summary>
    /// Fetches a range of items.
    /// </summary>
    /// <param name="startIndex">The start index.</param>
    /// <param name="count">The number of items to fetch.</param>
    /// <returns></returns>
    IList<T> FetchRange(int startIndex, int count);
}

If the underlying data source is a database query, it is relatively easy to implement the IItemsProvider interface using the COUNT() aggregate function together with the OFFSET and LIMIT clauses (or their equivalents) supported by most databases.
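To make the contract concrete, here is a minimal sketch of a provider backed by an in-memory list. ListItemsProvider is a hypothetical name that does not appear in the article's source, and the interface is repeated only so the snippet compiles on its own. Skip and Take play the role that OFFSET and LIMIT would play in a database query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Repeated from the article so the sketch is self-contained.
public interface IItemsProvider<T>
{
    int FetchCount();
    IList<T> FetchRange(int startIndex, int count);
}

// Hypothetical provider over an in-memory list, for illustration only.
public sealed class ListItemsProvider<T> : IItemsProvider<T>
{
    private readonly IReadOnlyList<T> _source;

    public ListItemsProvider(IReadOnlyList<T> source)
    {
        _source = source ?? throw new ArgumentNullException(nameof(source));
    }

    // Equivalent of SELECT COUNT(*) on a database source.
    public int FetchCount()
    {
        return _source.Count;
    }

    // Equivalent of OFFSET startIndex LIMIT count; a range that runs past
    // the end of the source simply yields fewer items.
    public IList<T> FetchRange(int startIndex, int count)
    {
        return _source.Skip(startIndex).Take(count).ToList();
    }
}
```

Note that the last page may be shorter than the requested count, so a collection built on top of such a provider should not assume every page is full.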


VirtualizingCollection<T> is an implementation of the IList interface that virtualizes data. It divides the entire collection space into a number of pages. Pages are loaded into memory as needed and released when they are no longer needed.
Interesting points will be discussed below. Please see the source code attached to this article for details.
The first aspect of the IList implementation is the implementation of the Count property. It is used by the ItemsControl to estimate the size of the collection and to draw a scrollbar.

private int _count = -1;

public virtual int Count
{
    get
    {
        if (_count == -1)
        {
            LoadCount();
        }
        return _count;
    }
    protected set
    {
        _count = value;
    }
}

protected virtual void LoadCount()
{
    Count = FetchCount();
}

protected int FetchCount()
{
    return ItemsProvider.FetchCount();
}

The Count property is implemented using the lazy load pattern. It uses a special value of -1 to indicate that the value has not yet been loaded. The first time the property is accessed it will load the actual number of items from the ItemsProvider.
Another important aspect of the IList interface is the implementation of the indexer.

public T this[int index]
{
    get
    {
        // determine which page and the offset within that page
        int pageIndex = index / PageSize;
        int pageOffset = index % PageSize;

        // request the primary page
        RequestPage(pageIndex);

        // if accessing the upper 50% of the page, request the next page
        if (pageOffset > PageSize / 2 && pageIndex < Count / PageSize)
            RequestPage(pageIndex + 1);

        // if accessing the lower 50% of the page, request the previous page
        if (pageOffset < PageSize / 2 && pageIndex > 0)
            RequestPage(pageIndex - 1);

        // remove stale pages
        CleanUpPages();

        // defensive check in case of asynchronous loading
        if (_pages[pageIndex] == null)
            return default(T);

        // return the requested item
        return _pages[pageIndex][pageOffset];
    }
    set { throw new NotSupportedException(); }
}

The indexer is the most distinctive part of the solution. First, it determines which page the requested item belongs to (pageIndex) and the item's offset within that page (pageOffset). It then calls RequestPage() to request that page.
Next, the following or the preceding page is requested, depending on pageOffset. This rests on the assumption that if users are viewing page 0, there is a good chance they will scroll down to view page 1. Fetching the data in advance avoids gaps in the display as the user scrolls.
CleanUpPages() is called to clear (or unload) pages that are no longer in use.
Finally, there is a defensive check that the page is actually available. This check is necessary when RequestPage() does not operate synchronously, as in the derived class AsyncVirtualizingCollection<T>.
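The page arithmetic can be checked with a small standalone sketch. PageMath, Locate and PrefetchPage are hypothetical names introduced here for illustration; the sketch assumes that the two conditions in each prefetch check are combined with a logical AND.

```csharp
using System;

// Hypothetical helper that isolates the indexer's page arithmetic.
public static class PageMath
{
    // Returns the page that holds the item and the item's offset within it.
    public static (int PageIndex, int PageOffset) Locate(int index, int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));
        return (index / pageSize, index % pageSize);
    }

    // Returns the neighbouring page the indexer would also request,
    // or null when only the primary page is needed.
    public static int? PrefetchPage(int index, int pageSize, int count)
    {
        var (pageIndex, pageOffset) = Locate(index, pageSize);

        // Upper half of the page: look ahead to the next page.
        if (pageOffset > pageSize / 2 && pageIndex < count / pageSize)
            return pageIndex + 1;

        // Lower half of the page: look back to the previous page.
        if (pageOffset < pageSize / 2 && pageIndex > 0)
            return pageIndex - 1;

        return null;
    }
}
```

With a page size of 100 and 1,000 items, index 251 lies on page 2 at offset 51, so page 3 is requested as well. Index 249 lies at offset 49, so page 1 is requested. Index 250 sits exactly at the midpoint and triggers neither. One detail to be aware of: when the count is an exact multiple of the page size, the look-ahead condition as written can request one page past the last, so the provider should tolerate a range that starts at the end of the data.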

private readonly Dictionary<int, IList<T>> _pages =
    new Dictionary<int, IList<T>>();
private readonly Dictionary<int, DateTime> _pageTouchTimes =
    new Dictionary<int, DateTime>();

protected virtual void RequestPage(int pageIndex)
{
    if (!_pages.ContainsKey(pageIndex))
    {
        // reserve the slot first so the page is not requested twice
        _pages.Add(pageIndex, null);
        _pageTouchTimes.Add(pageIndex, DateTime.Now);
        LoadPage(pageIndex);
    }
    else
    {
        _pageTouchTimes[pageIndex] = DateTime.Now;
    }
}

protected virtual void PopulatePage(int pageIndex, IList<T> page)
{
    if (_pages.ContainsKey(pageIndex))
        _pages[pageIndex] = page;
}

public void CleanUpPages()
{
    List<int> keys = new List<int>(_pageTouchTimes.Keys);
    foreach (int key in keys)
    {
        // page 0 is a special case, since the WPF ItemsControl
        // accesses the first item frequently
        if (key != 0 && (DateTime.Now - _pageTouchTimes[key]).TotalMilliseconds > PageTimeout)
        {
            _pages.Remove(key);
            _pageTouchTimes.Remove(key);
        }
    }
}

Pages are stored in a Dictionary, which uses the index as the key. The Dictionary is also used to store information about the time last used. This time is updated every time the page is accessed. It is used by CleanUpPages() to remove pages that have not been accessed in a significant amount of time.
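The eviction rule can be illustrated in isolation. The sketch below is not the article's class: PageTouchTracker, Touch and Evict are hypothetical names, and the current time is passed in explicitly (instead of reading DateTime.Now) so the behaviour is easy to verify.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical tracker that mirrors the touch-time bookkeeping.
public sealed class PageTouchTracker
{
    private readonly Dictionary<int, DateTime> _touchTimes =
        new Dictionary<int, DateTime>();
    private readonly TimeSpan _timeout;

    public PageTouchTracker(TimeSpan timeout)
    {
        _timeout = timeout;
    }

    // Records that a page was accessed at the given time.
    public void Touch(int pageIndex, DateTime now)
    {
        _touchTimes[pageIndex] = now;
    }

    public bool IsTracked(int pageIndex)
    {
        return _touchTimes.ContainsKey(pageIndex);
    }

    // Removes every page (except page 0) that has not been touched
    // within the timeout, and returns the indexes that were removed.
    public IList<int> Evict(DateTime now)
    {
        var evicted = new List<int>();

        // Copy the keys so the dictionary can be modified while iterating.
        foreach (int key in new List<int>(_touchTimes.Keys))
        {
            if (key != 0 && now - _touchTimes[key] > _timeout)
            {
                _touchTimes.Remove(key);
                evicted.Add(key);
            }
        }
        return evicted;
    }
}
```

With a 30-second timeout, a page last touched 31 seconds ago is evicted, a page touched 11 seconds ago survives, and page 0 is kept regardless of its age.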

protected virtual void LoadPage(int pageIndex)
{
    PopulatePage(pageIndex, FetchPage(pageIndex));
}

protected IList<T> FetchPage(int pageIndex)
{
    return ItemsProvider.FetchRange(pageIndex * PageSize, PageSize);
}

Finally, FetchPage() fetches a page from the ItemsProvider, and the LoadPage() method works by calling the PopulatePage() method, which places the page in the dictionary with the given index.
It may seem like there are a lot of non-essential methods in the code, but they were designed that way for a reason. Each method performs exactly one task. This helps to keep the code readable and also makes it easy to extend and modify functionality in derived classes, as will be observed later.
The VirtualizingCollection<T> class achieves the main goal of data virtualization. Unfortunately, it has one major drawback in practice: all of its data retrieval methods execute synchronously. This means they run on the UI thread, which can make the application unresponsive.

AsyncVirtualizingCollection<T>

The AsyncVirtualizingCollection<T> class derives from VirtualizingCollection<T> and overrides the LoadCount() and LoadPage() methods to load data asynchronously. A key requirement of an asynchronous data source is that it must notify the user interface through data binding when data arrives. For ordinary objects, this is done with the INotifyPropertyChanged interface. For collections, its close relative INotifyCollectionChanged should be used. This is the interface used by the ObservableCollection<T> class.

public event NotifyCollectionChangedEventHandler CollectionChanged;

protected virtual void OnCollectionChanged(NotifyCollectionChangedEventArgs e)
{
    NotifyCollectionChangedEventHandler h = CollectionChanged;
    if (h != null)
        h(this, e);
}

private void FireCollectionReset()
{
    NotifyCollectionChangedEventArgs e =
        new NotifyCollectionChangedEventArgs(NotifyCollectionChangedAction.Reset);
    OnCollectionChanged(e);
}

public event PropertyChangedEventHandler PropertyChanged;

protected virtual void OnPropertyChanged(PropertyChangedEventArgs e)
{
    PropertyChangedEventHandler h = PropertyChanged;
    if (h != null)
        h(this, e);
}

private void FirePropertyChanged(string propertyName)
{
    PropertyChangedEventArgs e = new PropertyChangedEventArgs(propertyName);
    OnPropertyChanged(e);
}

The AsyncVirtualizingCollection<T> class implements both the INotifyPropertyChanged and INotifyCollectionChanged interfaces to provide maximum binding flexibility. There is nothing remarkable about this implementation.

protected override void LoadCount()
{
    Count = 0;
    IsLoading = true;
    ThreadPool.QueueUserWorkItem(LoadCountWork);
}

private void LoadCountWork(object args)
{
    int count = FetchCount();
    SynchronizationContext.Send(LoadCountCompleted, count);
}

private void LoadCountCompleted(object args)
{
    Count = (int)args;
    IsLoading = false;
    FireCollectionReset();
}

In the overridden LoadCount() method, the count is fetched asynchronously on a ThreadPool thread. When the fetch completes, the new count is set and FireCollectionReset() is called, which updates the user interface through INotifyCollectionChanged. Note that LoadCountCompleted is invoked on the UI thread by way of the SynchronizationContext. The SynchronizationContext property is set in the class constructor, on the assumption that the collection instance is created on the UI thread.
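The hand-off between the worker thread and the captured context can be sketched on its own. AsyncCountLoader and its Loaded event are hypothetical names, not part of the article's source. The fallback to a plain SynchronizationContext is an assumption made so the sketch also runs outside WPF, where no context is installed and Send simply invokes the callback on the calling thread.

```csharp
using System;
using System.Threading;

// Hypothetical class that isolates the "fetch on a worker thread,
// complete on the captured context" pattern.
public sealed class AsyncCountLoader
{
    private readonly SynchronizationContext _context;
    private readonly Func<int> _fetchCount;

    public int Count { get; private set; }
    public bool IsLoading { get; private set; }
    public event EventHandler Loaded;

    public AsyncCountLoader(Func<int> fetchCount)
    {
        _fetchCount = fetchCount ?? throw new ArgumentNullException(nameof(fetchCount));

        // Captured on the constructing thread. In WPF this is the UI
        // thread's context; elsewhere it may be null, so fall back to
        // the default context (an assumption for this sketch).
        _context = SynchronizationContext.Current ?? new SynchronizationContext();
    }

    public void Load()
    {
        IsLoading = true;
        ThreadPool.QueueUserWorkItem(_ =>
        {
            // The slow work runs on a pool thread.
            int count = _fetchCount();

            // The completion is marshalled through the captured context.
            _context.Send(state =>
            {
                Count = (int)state;
                IsLoading = false;
                Loaded?.Invoke(this, EventArgs.Empty);
            }, count);
        });
    }
}
```

Send blocks the worker until the callback has run. Post would queue the callback without waiting, which may be preferable when the worker has nothing else to do.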

protected override void LoadPage(int index)
{
    IsLoading = true;
    ThreadPool.QueueUserWorkItem(LoadPageWork, index);
}

private void LoadPageWork(object args)
{
    int pageIndex = (int)args;
    IList<T> page = FetchPage(pageIndex);
    SynchronizationContext.Send(LoadPageCompleted, new object[] { pageIndex, page });
}

private void LoadPageCompleted(object args)
{
    int pageIndex = (int)((object[])args)[0];
    IList<T> page = (IList<T>)((object[])args)[1];

    PopulatePage(pageIndex, page);
    IsLoading = false;
    FireCollectionReset();
}

Asynchronous page data loading follows the same rules, and again the FireCollectionReset() method is used to update the user interface.
Note also the IsLoading property. This is a simple flag that the UI can use to indicate that the collection is loading. When IsLoading changes, FirePropertyChanged() causes the UI to update through the INotifyPropertyChanged mechanism.

public bool IsLoading
{
    get
    {
        return _isLoading;
    }
    set
    {
        if (value != _isLoading)
        {
            _isLoading = value;
            FirePropertyChanged("IsLoading");
        }
    }
}

Demonstration project

To demonstrate this solution, I created a simple demo project (included in the source code for the article).
First, I created an implementation of the IItemsProvider interface that supplies dummy data and puts the thread to sleep to simulate the delay of fetching data from disk or over the network.

public class DemoCustomerProvider : IItemsProvider<Customer>
{
    private readonly int _count;
    private readonly int _fetchDelay;

    public DemoCustomerProvider(int count, int fetchDelay)
    {
        _count = count;
        _fetchDelay = fetchDelay;
    }

    public int FetchCount()
    {
        Thread.Sleep(_fetchDelay);
        return _count;
    }

    public IList<Customer> FetchRange(int startIndex, int count)
    {
        Thread.Sleep(_fetchDelay);

        List<Customer> list = new List<Customer>();
        for (int i = startIndex; i < startIndex + count; i++)
        {
            Customer customer = new Customer { Id = i + 1, Name = "Customer " + (i + 1) };
            list.Add(customer);
        }
        return list;
    }
}

The ubiquitous Customer object is used as a collection item.
A simple WPF window with a ListView control was created to allow the user to experiment with different list implementations.

<Window x:Class="DataVirtualization.DemoWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="Data Virtualization Demo - By Paul McClean" Height="600" Width="600">
    <Window.Resources>
        <Style x:Key="lvStyle" TargetType="{x:Type ListView}">
            <Setter Property="VirtualizingStackPanel.IsVirtualizing" Value="True"/>
            <Setter Property="VirtualizingStackPanel.VirtualizationMode" Value="Recycling"/>
            <Setter Property="ScrollViewer.IsDeferredScrollingEnabled" Value="True"/>
            <Setter Property="ListView.ItemsSource" Value="{Binding}"/>
            <Setter Property="ListView.View">
                <Setter.Value>
                    <GridView>
                        <GridViewColumn Header="Id" Width="100">
                            <GridViewColumn.CellTemplate>
                                <DataTemplate>
                                    <TextBlock Text="{Binding Id}"/>
                                </DataTemplate>
                            </GridViewColumn.CellTemplate>
                        </GridViewColumn>
                        <GridViewColumn Header="Name" Width="150">
                            <GridViewColumn.CellTemplate>
                                <DataTemplate>
                                    <TextBlock Text="{Binding Name}"/>
                                </DataTemplate>
                            </GridViewColumn.CellTemplate>
                        </GridViewColumn>
                    </GridView>
                </Setter.Value>
            </Setter>
            <Style.Triggers>
                <DataTrigger Binding="{Binding IsLoading}" Value="True">
                    <Setter Property="ListView.Cursor" Value="Wait"/>
                    <Setter Property="ListView.Background" Value="LightGray"/>
                </DataTrigger>
            </Style.Triggers>
        </Style>
    </Window.Resources>
    <Grid Margin="5">
        <Grid.RowDefinitions>
            <RowDefinition Height="Auto"/>
            <RowDefinition Height="Auto"/>
            <RowDefinition Height="Auto"/>
            <RowDefinition Height="*"/>
        </Grid.RowDefinitions>
        <GroupBox Grid.Row="0" Header="ItemsProvider">
            <StackPanel Orientation="Horizontal" Margin="0, 2, 0, 0">
                <TextBlock Text="Number of items:" Margin="5"
                           TextAlignment="Right" VerticalAlignment="Center"/>
                <TextBox x:Name="tbNumItems" Margin="5"
                         Text="1000000" Width="60" VerticalAlignment="Center"/>
                <TextBlock Text="Fetch Delay (ms):" Margin="5"
                           TextAlignment="Right" VerticalAlignment="Center"/>
                <TextBox x:Name="tbFetchDelay" Margin="5"
                         Text="1000" Width="60" VerticalAlignment="Center"/>
            </StackPanel>
        </GroupBox>
        <GroupBox Grid.Row="1" Header="Collection">
            <StackPanel>
                <StackPanel Orientation="Horizontal" Margin="0, 2, 0, 0">
                    <TextBlock Text="Type:" Margin="5"
                               TextAlignment="Right" VerticalAlignment="Center"/>
                    <RadioButton x:Name="rbNormal" GroupName="rbGroup"
                                 Margin="5" Content="List(T)" VerticalAlignment="Center"/>
                    <RadioButton x:Name="rbVirtualizing" GroupName="rbGroup"
                                 Margin="5" Content="VirtualizingList(T)"
                                 VerticalAlignment="Center"/>
                    <RadioButton x:Name="rbAsync" GroupName="rbGroup"
                                 Margin="5" Content="AsyncVirtualizingList(T)"
                                 IsChecked="True" VerticalAlignment="Center"/>
                </StackPanel>
                <StackPanel Orientation="Horizontal" Margin="0, 2, 0, 0">
                    <TextBlock Text="Page size:" Margin="5"
                               TextAlignment="Right" VerticalAlignment="Center"/>
                    <TextBox x:Name="tbPageSize" Margin="5"
                             Text="100" Width="60" VerticalAlignment="Center"/>
                    <TextBlock Text="Page timeout (s):" Margin="5"
                               TextAlignment="Right" VerticalAlignment="Center"/>
                    <TextBox x:Name="tbPageTimeout" Margin="5"
                             Text="30" Width="60" VerticalAlignment="Center"/>
                </StackPanel>
            </StackPanel>
        </GroupBox>
        <StackPanel Orientation="Horizontal" Grid.Row="2">
            <TextBlock Text="Memory Usage:" Margin="5"
                       VerticalAlignment="Center"/>
            <TextBlock x:Name="tbMemory" Margin="5"
                       Width="80" VerticalAlignment="Center"/>
            <Button Content="Refresh" Click="Button_Click"
                    Margin="5" Width="100" VerticalAlignment="Center"/>
            <Rectangle Name="rectangle" Width="20" Height="20"
                       Fill="Blue" Margin="5" VerticalAlignment="Center">
                <Rectangle.RenderTransform>
                    <RotateTransform Angle="0" CenterX="10" CenterY="10"/>
                </Rectangle.RenderTransform>
                <Rectangle.Triggers>
                    <EventTrigger RoutedEvent="Rectangle.Loaded">
                        <BeginStoryboard>
                            <Storyboard>
                                <DoubleAnimation Storyboard.TargetName="rectangle"
                                                 Storyboard.TargetProperty="(TextBlock.RenderTransform).(RotateTransform.Angle)"
                                                 From="0" To="360" Duration="0:0:5"
                                                 RepeatBehavior="Forever" />
                            </Storyboard>
                        </BeginStoryboard>
                    </EventTrigger>
                </Rectangle.Triggers>
            </Rectangle>
            <TextBlock Margin="5" VerticalAlignment="Center"
                       FontStyle="Italic" Text="Pause in animation indicates UI thread stalled."/>
        </StackPanel>
        <ListView Grid.Row="3" Margin="5" Style="{DynamicResource lvStyle}"/>
    </Grid>
</Window>

There is no need to go through the XAML in detail. The only point worth noting is the custom ListView style, which changes the background and the mouse cursor in response to changes in the IsLoading property.

public partial class DemoWindow
{
    /// <summary>
    /// Initializes a new instance of the <see cref="DemoWindow"/> class.
    /// </summary>
    public DemoWindow()
    {
        InitializeComponent();

        // use a timer to periodically update the memory usage
        DispatcherTimer timer = new DispatcherTimer();
        timer.Interval = new TimeSpan(0, 0, 1);
        timer.Tick += timer_Tick;
        timer.Start();
    }

    private void timer_Tick(object sender, EventArgs e)
    {
        tbMemory.Text = string.Format("{0:0.00} MB", GC.GetTotalMemory(true) / 1024.0 / 1024.0);
    }

    private void Button_Click(object sender, RoutedEventArgs e)
    {
        // create the demo items provider according to specified parameters
        int numItems = int.Parse(tbNumItems.Text);
        int fetchDelay = int.Parse(tbFetchDelay.Text);
        DemoCustomerProvider customerProvider =
            new DemoCustomerProvider(numItems, fetchDelay);

        // create the collection according to specified parameters
        int pageSize = int.Parse(tbPageSize.Text);
        int pageTimeout = int.Parse(tbPageTimeout.Text);

        if (rbNormal.IsChecked.Value)
        {
            DataContext = new List<Customer>(
                customerProvider.FetchRange(0, customerProvider.FetchCount()));
        }
        else if (rbVirtualizing.IsChecked.Value)
        {
            DataContext = new VirtualizingCollection<Customer>(customerProvider, pageSize);
        }
        else if (rbAsync.IsChecked.Value)
        {
            DataContext = new AsyncVirtualizingCollection<Customer>(
                customerProvider, pageSize, pageTimeout * 1000);
        }
    }
}

The window layout is pretty simple, but enough to demonstrate the solution.
The user can adjust the number of items in the DemoCustomerProvider instance and the simulated fetch delay.
The demo allows users to compare the standard List(T) implementation, the VirtualizingCollection(T) implementation with synchronous data loading, and the AsyncVirtualizingCollection(T) implementation with asynchronous data loading. When using VirtualizingCollection(T) and AsyncVirtualizingCollection(T), the user can set the page size and timeout (sets the time after which the page should be unloaded from memory). These should be chosen according to the characteristics of the item and the expected usage pattern.
To compare the different collection types, the window also displays the total amount of memory in use. A rotating square animation makes stalls of the UI thread visible. In a fully asynchronous solution, the animation should never stutter or stop.
