Share via


using LINQ to select from DataTable

Question

Wednesday, August 27, 2014 5:50 PM

Hello, I have the following sample data(below) that is in a datatable. I need to select into a new datatable distinct rows by colname and where the most occurance of Datatype, then if a tie exists with DataType use Length.

I can get the correct data with this sql statement if the data was in a sql database table:

  SELECT
      *
FROM (
            SELECT
                  *
                , ROW_NUMBER() OVER (PARTITION BY Datatype
                                     ORDER BY length DESC) AS w
            FROM TN_ImportFileColumns
      ) x
WHERE w = 1

What I need to have:

ColumnName DataType    Length
created         datetime    10
phone           int         10
name            string          4

From this sample data:

ColumnName ExampleData DataType    Length
name            scott           string          4
name            ben         string          3
name            kim         string          3
phone           1111111111  int         10
phone           222222          int         6
phone                       string          0
created         12-31-2014  datetime    10
created         11-09-2013  datetime    10
created         11092013    int         8

Some background info:

The data in the datatable comes from a parse of a text file where I'm trying to determine the most likely datatype for a give row. In my sample data there are 3 columns: name, phone and created. But you see that created has 2 entries for datetime and one for int. Since column(datatype) is most likely datetime then that is the row I need for the given columnName.

newb

All replies (7)

Wednesday, August 27, 2014 6:29 PM ✅Answered

You will most likely use a Linq projection like in the code below and build a collection of custom objects. Then you can walk the collection with foreach loop and build table records from the objects. You use a distinct function. But more likely, you'll have to go into some kind of foreach loop to detemine what the column type is and get records that way.

Yiu might be able to even use Linq with a Where clause to  selete based on column type.

http://www.blackwasp.co.uk/LinqProjection.aspx

http://blogs.msdn.com/b/charlie/archive/2006/11/19/linq-farm-group-and-distinct.aspx

// Fill the DataSet.
DataSet ds = new DataSet();
ds.Locale = CultureInfo.InvariantCulture;
FillDataSet(ds);

DataTable products = ds.Tables["Product"];

var query = products.AsEnumerable().
    Select(product => new
    {
        ProductName = product.Field<string>("Name"),
        ProductNumber = product.Field<string>("ProductNumber"),
        Price = product.Field<decimal>("ListPrice")
    });

Console.WriteLine("Product Info:");
foreach (var productInfo in query)
{
    Console.WriteLine("Product name: {0} Product number: {1} List price: ${2} ",
        productInfo.ProductName, productInfo.ProductNumber, productInfo.Price);
}


Thursday, August 28, 2014 12:42 PM ✅Answered

I started looking into writing something for this.

Working with datatables in LINQ is a right nuisance.

Personally....

I would probably write the data to a database.

A local sql ce database perhaps.

You can then use sql to manipulate the data, which will be way less painfull.

I'm not sure if sqlbulkcopy works with sql ce but it would with proper sql server to a temporary table.

public bool BulkInsertDataTable(string tableName, DataTable dataTable)
{
    bool isSuccuss;
    try
    {
        SqlConnection SqlConnectionObj = GetSQLConnection();
        SqlBulkCopy bulkCopy = new SqlBulkCopy(SqlConnectionObj, SqlBulkCopyOptions.TableLock | SqlBulkCopyOptions.FireTriggers | SqlBulkCopyOptions.UseInternalTransaction, null);
        bulkCopy.DestinationTableName = tableName;
        bulkCopy.WriteToServer(dataTable);
        isSuccuss = true;
    }
    catch (Exception ex)
    {
        isSuccuss = false;
    }
    return isSuccuss;
}

Once you have the data there you can use sql.

That partition stuff probably wouldn't work with sql ce, you might need to nest queries.

Hope that helps
Please don't forget to up vote answers you like or which help you and mark one(s) which answer your question.


Thursday, August 28, 2014 3:39 PM ✅Answered

Try this

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;

namespace WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        DataTable dt = new DataTable();
        DataTable dt1 = new DataTable();
        List<List<object>> input = new List<List<object>>() {
         new List<object>{ "name", "scott",   typeof(string) , 4},
         new List<object>{ "name", "ben", typeof(string), 3},
         new List<object>{ "name", "kim", typeof(string), 3},
         new List<object>{ "phone", 1111111111, typeof(int), 10},
         new List<object>{ "phone", 222222, typeof(int), 6},
         new List<object>{ "phone",null,    typeof(string),0},
         new List<object>{ "created", 12-31-2014, typeof(DateTime), 10},
         new List<object>{ "created", 11-09-2013, typeof(DateTime),10},
         new List<object>{ "created", 11092013, typeof(int),8}
         };
        public Form1()
        {
            InitializeComponent();
            dt.Columns.Add("ColumnName", typeof(string));
            dt.Columns.Add("ExampleData", typeof(object));
            dt.Columns.Add("DataType", typeof(Type));
            dt.Columns.Add("Length", typeof(int));
            
            foreach(List<object> row in input)
            {
                DataRow newRow = dt.Rows.Add();
                newRow.ItemArray = row.ToArray();
                dataGridView1.DataSource = dt;
            }

        }

        private void button1_Click(object sender, EventArgs e)
        {
            var results = (dt.AsEnumerable()
                .GroupBy(x => x.Field<string>("ColumnName")) 
                .ToList()
                .Select( x => new { 
                    ColumnName = x.Key, 
                    Type = x.Take(1).Single().Field<Type>("DataType"),
                    Length = x.Select(y => y.Field<int>("Length")).Max()
                }));

            dt1.Columns.Add("ColumnName", typeof(string));
            dt1.Columns.Add("DataType", typeof(Type));
            dt1.Columns.Add("Length", typeof(int));
            foreach (var row in results)
            {
                dt1.Rows.Add(new object[]{ row.ColumnName, row.Type, row.Length });
            }
            
            dataGridView1.DataSource = dt1;
        }
    }
}

jdweng


Friday, August 29, 2014 4:04 PM ✅Answered

Joel's query is close, but doesn't return the "DataType" column from the row that has the largest Length.

You can grab the target row first, then select the values from that.

            var results = dt.AsEnumerable()
                            .GroupBy(x => x.Field<string>("ColumnName"))
                            .Select(x => new
                            {
                                Key = x.Key,
                                FirstRow = x.OrderBy(y => y.Field<int>("Length")).First()
                            })
                            .Select( x => new 
                            {
                                ColumnName = x.Key,
                                DataType = x.FirstRow.Field<Type>("DataType"),
                                Length = x.FirstRow.Field<int>("Length")
                            }
                            );

David

David http://blogs.msdn.com/b/dbrowne/


Wednesday, August 27, 2014 7:40 PM

I'm not sure it's that hard.

Breaking the algorithm down:

First group by ColumnName, Datatype, Count(*) as c, Max(Length) as mx

Then group by Columname, Datatype, c, mx,    sorted by Columname, Datatype, c desc, mx desc

Hope that helps
Please don't forget to up vote answers you like or which help you and mark one(s) which answer your question.


Thursday, August 28, 2014 2:47 AM

Worst case scenario. If you cannot do everything in one LINQ statement then you can always spilt them each SELECT a LINQ query, and loop in through to filter.

chanmm

chanmm


Friday, August 29, 2014 12:43 PM

Thank you everyone for your assistance. I learning and testing. I will get back to everyone with my solution. I like jdweng's solution best so far.

newb