Greetings community,

I've exhausted all ideas and patience and turn to the community at large for help. I have data formatted as an ASCII text file with easting, northing, and depth fields. I've attempted an arrayList with some success, but there has to be a better and more efficient approach.

'============================================
What I am trying to do written in Pseudocode:
'Open a text file containing XYZ data (easting, northing, depth)
'Data string looks like this:

Easting Northing Depth
361274.1853 4769915.1568 -49.55
361274.1653 4769915.1554 -49.58
361274.1553 4769915.1548 -49.62
361274.1453 4769915.1541 -49.65

'Split each line into a 3D array
'Performing some math functions on the depth field_
'Math functions mimic an Exploratory Data Analysis_
'e.g. mean, std dev, skewness, least squares, etc

'Based on the results of the math functions_
'Decide what to keep and what to randomly remove_
'this is essentially stratified random sampling of the data

'Write the data back to a delimited text file_
'Easy right? :) <in a sarcastic tone>
'==============================================

Performance is especially critical since I am attempting to populate the 3D array with approximately 3000 data samples.

I should also note that I am NOT a computer programmer by trade, but treat it with great interest! Just need to find more time to devote to it :)

Thank you for your time and consideration!

PS: I am looking to turn this into a research paper at a future conference and would be more than happy to include the programmer as a co-author.

I know it's not much, but here's what I've got so far:

        Dim dataLine As String
        Dim xyzFile As String
        Dim xyzArray(3, 3, 3) As String 'not sure if this is right
        Dim x As Double = 0
        Dim y As Double = 0
        Dim z As Double = 0
        Dim columnCount As Integer
        Dim iCount As Integer = columnCount - 1
        Dim xyzArray(iCount, 3) As Double


        Dim sr As New StreamReader("Datafile.txt")
        Do
            dataLine = sr.ReadLine()
            For x = 0 To 3
                For y = 0 To 3
                    For z = 0 To 3
                        xyzArray(x, y, z) = x + y + z
                        MessageBox.Show(xyzArray(x, y, z))
                    Next z
                Next y
            Next x
            xyzFile = xyzFile & dataLine
        Loop Until dataLine Is Nothing
        sr.Close()
        xyzArray(x, y, z) = Split(xyzFile, " ")

If performance is the problem, I suggest reading the whole file into memory as a single string and doing the calculations in memory.
I am assuming fixed-length records, so you would prepare a buffer of 3000 records x record length and read the file in one pass as a string rather than line by line.
That is much faster because it uses less I/O, and calculations on data already in memory are much faster.
The catch is that you have to write the program to do it.

Thanks for the prompt response Ahmad! So if I read the file using an instance of streamreader, how would I pull out the depth values and perform calculations in memory?

>Split each line into a 3D array

You need to create an array of 3 elements of double type.

Take a look at this sample,

        Dim lines() As String = System.IO.File.ReadAllLines("DataFile.txt")

        Dim list As New List(Of Double())
        For Each str As String In lines
            Dim ele() As Double = Array.ConvertAll(str.Split(" "), Function(k) Double.Parse(k))
            list.Add(ele)
        Next

        For Each d As Double() In list
            Console.WriteLine(d(0) & " " & d(1) & " " & d(2))
        Next

This is great! Thank you adatapost. This is exactly what I was looking for. After being accustomed to using VB6 for so long, I was unaware of how powerful lists were in VB.NET and stuck with For loops. I do have another question, though. Could you explain what is actually happening on this line:

Dim ele() As Double = Array.ConvertAll(str.Split(" "), Function(k) Double.Parse(k))

I read this as declare a variable called 'ele' to store strings that are converted to an Array using space as a delimiter, but what is Function(k) and Double.Parse(k) actually doing? Is this analogous to a tuple?

Much appreciation!

Dim ele() As Double = Array.ConvertAll(str.Split(" "), Function(k) Double.Parse(k))

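The ConvertAll method converts each element of the String array into a Double and returns the results as a new Double array.

Function(k) Double.Parse(k) is a lambda expression (an anonymous function). ConvertAll calls it once for each element of the String array, with k holding that element, and it returns the element parsed as a Double. It is a function rather than a data structure, so it is not analogous to a tuple.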
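
To make the lambda more concrete, here is a minimal sketch that does the same conversion twice: once with Array.ConvertAll and a lambda, and once as a plain loop. The module and function names (ParseDemo, ParseWithLambda, ParseWithLoop) are made up for illustration, and passing CultureInfo.InvariantCulture is my own addition so that "." is always read as the decimal separator regardless of the PC's regional settings.

```vbnet
Imports System
Imports System.Globalization

Module ParseDemo

    ' Convert one "easting northing depth" line with ConvertAll and a lambda.
    ' The lambda is called once per String element and returns one Double.
    Function ParseWithLambda(ByVal line As String) As Double()
        Dim parts() As String = line.Split(" "c)
        Return Array.ConvertAll(parts, Function(k) Double.Parse(k, CultureInfo.InvariantCulture))
    End Function

    ' The same conversion written as an explicit loop.
    Function ParseWithLoop(ByVal line As String) As Double()
        Dim parts() As String = line.Split(" "c)
        Dim result(parts.Length - 1) As Double
        For i As Integer = 0 To parts.Length - 1
            result(i) = Double.Parse(parts(i), CultureInfo.InvariantCulture)
        Next
        Return result
    End Function

End Module
```

Both functions should return the same three-element array for a line such as "361274.1853 4769915.1568 -49.55". Note that either one will throw a FormatException on a header line like "Easting Northing Depth", so a header would need to be skipped first.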


I think the solution given by adatapost is great, and I hope it solves the problem.

Thanks again adatapost for your assistance. I do have another question, though: how does VB.NET 2008 handle a List holding nested arrays differently from a traditional VB6 array? I'm trying to pass the array that was parsed with the lambda expression to private subroutines (using the old VB6 syntax), but my subroutines do not seem to recognize the ele() array (VB Studio reports "Name 'ele' is not declared"). Any suggestions?

Public Class Form1

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

        Dim lines() As String = System.IO.File.ReadAllLines("FILE_TEXT.txt")

        Dim list As New List(Of Double())
        For Each str As String In lines
            Dim ele() As Double = Array.ConvertAll(str.Split(" "), Function(k) Double.Parse(k)) 'lambda expression
            list.Add(ele)
        Next

        For Each d As Double() In list
            'Console.WriteLine(d(0) & " " & d(1) & " " & d(2))
            TextBox1.Text = TextBox1.Text & d(0) & " " & d(1) & " " & d(2) & vbCrLf
        Next

        Dim meanVal As Double
        Me.calcMean(ele(), meanVal) 'Error: Name 'ele' is not declared
        Console.WriteLine(ele(), meanVal) 'Error: Name 'ele' is not declared

        'Calc Standard Deviation

        'Calc Data Skewness

        'Calc Data Kurtosis

    End Sub

    Private Sub calcMean(ByRef ele() As Double, ByRef meanVal As Double)

        Dim sum As Double
        Dim i As Integer
        For i = 0 To 3 Step 3 'loop is hardcoded for debugging
            sum = sum + ele(i)
        Next i
        meanVal = sum / 4 'Temporary values for debugging

    End Sub

    Private Sub calcStDev(ByRef ele() As Double, ByRef stDev As Double, ByRef meanVal As Double)
        Dim i As Integer
        Dim x1 As Double
        Dim x2 As Double

        For i = 0 To 3 Step 3 'loop is hardcoded for debugging
            x1 = (ele(i) - meanVal) ^ 2
            x2 = x2 + x1
        Next
        stDev = Math.Sqrt(x2 / 3) 'Temporary values for debugging

    End Sub


    Private Sub calcSkewness(ByRef ele() As Double, ByRef stDev As Double, ByRef meanVal As Double, ByRef skewness As Double)

        Dim i As Integer
        Dim y1 As Double
        Dim y2 As Double

        For i = 0 To 3 Step 3
            y1 = (ele(i) - meanVal) ^ 3
            y2 = y2 + y1
        Next i
        skewness = (y2) / (3 * (stDev ^ 3)) 'uses the stDev parameter; a stray local "stdDev As Object" was never assigned

    End Sub


    Private Sub calcKurtosis(ByRef ele() As Double, ByRef stDev As Double, ByRef meanVal As Double, ByRef kurtosis As Double)

        Dim i As Integer
        Dim z1 As Double
        Dim z2 As Double

        For i = 0 To 3 Step 3
            z1 = (ele(i) - meanVal) ^ 4
            z2 = z2 + z1
        Next i
        kurtosis = (z2) / (3 * (stDev ^ 4))

    End Sub

End Class

The variable ele is local to the For Each loop (because it is declared inside the loop), so you can't use it outside the loop.

In the following code, each pass of the For Each loop reads one one-dimensional Double array from the list and stores it in the variable d.

....
        For Each d As Double() In list

            Dim meanVal As Double
            Me.calcMean(d, meanVal)
            Console.WriteLine(d(0) & " " & d(1) & " " & d(2) & "  " & meanVal)
        Next
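
The scoping rule can be seen in isolation in the small sketch below. ScopeDemo and LastRow are hypothetical names used only for this illustration; the point is simply that a variable declared inside the loop body disappears when the loop ends, while one declared before the loop is still usable afterwards.

```vbnet
Imports System
Imports System.Collections.Generic

Module ScopeDemo

    ' Returns the last row in the list, or Nothing if the list is empty.
    Function LastRow(ByVal rows As List(Of Double())) As Double()
        Dim last() As Double = Nothing      ' declared before the loop: still visible after it
        For Each d As Double() In rows
            Dim current() As Double = d     ' declared inside the loop: visible only in the loop body
            last = current
        Next
        ' "d" and "current" are out of scope here, so only "last" can be used.
        Return last
    End Function

End Module
```

Referring to current or d after the Next line would give the same "Name ... is not declared" error that was reported for ele.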

Thanks again adatapost. I really wish you knew how much help you have given me and how grateful I am!

You're welcome drummy.

I'm glad you got it working. Please mark this thread as solved if you have found an answer to your question and good luck!

Sorry adatapost, I'm just realizing that my answers are very peculiar:

Here's my table:
361274.1853 4769915.1568 -49.55
361274.1653 4769915.1554 -49.58
361274.1553 4769915.1548 -49.62
361274.1453 4769915.1541 -49.65

Answer from averages for each field should read:
361274.1628 4769915.155275 -49.6

BUT this is what I get for the averages instead:
5780386.9648 19079660.6272 -49.55
5780386.64480001 19079660.6216 -49.58
5780386.4848 19079660.6192 -49.62
5780386.32480001 19079660.6164 -49.65

The loop in my subroutine "calcMean" seems to be stuck on the first elements in the collection list. How do I increment this? My apologies for the questions, collections in VB.net are a really new concept to me. Any assistance would be appreciated!

Public Class Form1

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

        Dim lines() As String = System.IO.File.ReadAllLines("FILE_Text.txt")

        Dim list As New List(Of Double())
        For Each str As String In lines
            Dim ele() As Double = Array.ConvertAll(str.Split(" "), Function(k) Double.Parse(k)) 'lambda expression > .NET v3.5 Framework only
            list.Add(ele)
        Next

        For Each d As Double() In list

            'Calculate Mean Values for easting (NADIR), northing(NADIR), and depth
            Dim eastMeanVal, northMeanVal, depthMeanVal As Double
            Me.calcMean(d, eastMeanVal, northMeanVal, depthMeanVal)

            'Console.WriteLine(d(0) & " " & d(1) & " " & d(2))
            TextBox1.Text = TextBox1.Text & d(0) & vbTab & d(1) & vbTab & d(2) & vbTab & northMeanVal & vbTab & eastMeanVal & vbTab & depthMeanVal & vbCrLf

        Next

    End Sub

    Private Sub calcMean(ByRef d() As Double, ByRef eastMeanVal As Double, ByRef northMeanVal As Double, ByRef depthMeanVal As Double)

        Dim eastSum As Double = 0
        Dim northSum As Double = 0
        Dim depthSum As Double = 0
        Dim i, j, k As Integer

        For k = 0 To 4 'd.GetUpperBound(2) ' debug
            depthSum += d(2)  'getting the same return value. d(2) is not incrementing
            For j = 0 To 4 'd.GetUpperBound(1) '
                eastSum += d(1)  'getting the same return value. d(1) is not incrementing
                For i = 0 To 4 'd.GetUpperBound(0) '
                    northSum += d(0)  'getting the same return value. d(0) is not incrementing
                Next i
            Next j
        Next k

        eastMeanVal = eastSum / 4 ' UBound(d) 'for debug
        northMeanVal = northSum / 4
        depthMeanVal = depthSum / 4

    End Sub

Correct me if I am wrong, but I think you should pass the whole list, rather than a single array, to the calcMean method.

Public Class Form1

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

        Dim lines() As String = System.IO.File.ReadAllLines("FILE_Text.txt")

        Dim list As New List(Of Double())
        For Each str As String In lines
            Dim ele() As Double = Array.ConvertAll(str.Split(" "), Function(k) Double.Parse(k)) 'lambda expression > .NET v3.5 Framework only
            list.Add(ele)
        Next

        Dim eastMeanVal, northMeanVal, depthMeanVal As Double
        Me.calcMean(list, eastMeanVal, northMeanVal, depthMeanVal)

        TextBox1.Text = eastMeanVal & vbTab & northMeanVal & vbTab & depthMeanVal & vbCrLf

    End Sub

    Private Sub calcMean(ByVal list As List(Of Double()), ByRef eastMeanVal As Double, ByRef northMeanVal As Double, ByRef depthMeanVal As Double)

        Dim eastSum As Double = 0
        Dim northSum As Double = 0
        Dim depthSum As Double = 0

        'Each d is one row of the file: d(0) = easting, d(1) = northing, d(2) = depth
        For Each d As Double() In list
            eastSum += d(0)
            northSum += d(1)
            depthSum += d(2)
        Next

        'Divide by the number of rows actually read, not a hardcoded count
        eastMeanVal = eastSum / list.Count
        northMeanVal = northSum / list.Count
        depthMeanVal = depthSum / list.Count

    End Sub
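
The other statistics mentioned in the original question (standard deviation, skewness, kurtosis) can follow the same pattern as calcMean once the rows are in a List(Of Double()). The sketch below is only one possible layout, not a definitive implementation: DepthStats, Column, Mean, CentralMoment, StdDev, Skewness and Kurtosis are all names I made up, and dividing by (n - 1) is an assumption chosen to match the "/ 3" used for four rows in the earlier draft. Other definitions of skewness and kurtosis use different divisors, so check which one your analysis needs.

```vbnet
Imports System
Imports System.Collections.Generic

Module DepthStats

    ' Pull one column (0 = easting, 1 = northing, 2 = depth) out of the list of rows.
    Function Column(ByVal rows As List(Of Double()), ByVal index As Integer) As Double()
        Dim values(rows.Count - 1) As Double
        For i As Integer = 0 To rows.Count - 1
            values(i) = rows(i)(index)
        Next
        Return values
    End Function

    Function Mean(ByVal values() As Double) As Double
        Dim sum As Double = 0
        For Each v As Double In values
            sum += v
        Next
        Return sum / values.Length
    End Function

    ' Sum of (x - mean) ^ power, divided by (n - 1). Assumes at least two values.
    Function CentralMoment(ByVal values() As Double, ByVal power As Integer) As Double
        Dim m As Double = Mean(values)
        Dim sum As Double = 0
        For Each v As Double In values
            sum += (v - m) ^ power
        Next
        Return sum / (values.Length - 1)
    End Function

    Function StdDev(ByVal values() As Double) As Double
        Return Math.Sqrt(CentralMoment(values, 2))
    End Function

    Function Skewness(ByVal values() As Double) As Double
        Return CentralMoment(values, 3) / (StdDev(values) ^ 3)
    End Function

    Function Kurtosis(ByVal values() As Double) As Double
        Return CentralMoment(values, 4) / (StdDev(values) ^ 4)
    End Function

End Module
```

From Button1_Click this could be used as Dim depths() As Double = DepthStats.Column(list, 2) followed by DepthStats.Mean(depths), DepthStats.StdDev(depths), and so on. Extracting the column once keeps each statistic a short function over a plain Double array.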

And yet again you leave me speechless :) This is exactly what I was looking for. I now see that my problem was trying to treat the data as a traditional array. I'm starting to get the grasp of collections and am beginning to like them quite a bit. Thank you!
