Avoiding Date Conversion When Pasting an HTML Table
From Get Data From a Website…, MD comments:
However, the info in the table I copy is formatted ## / ## whereby # represents a number.
So when it tries to paste 10 / 10 in the worksheet for example, it auto-changes to 10/okt in my sheet.
This is a common problem that’s hard to solve. There’s no setting I can find in Excel that tells it to stop converting things that look like dates into dates. Since I’m reading in the data and putting it in the clipboard, I can just message the data before I do it. That’s not so easy when you just want to copy and paste, but if you’re using code, you may find the technique useful.
In the above linked post, I automated Internet Explorer to login to a website. I don’t automate Internet Explorer any more, preferring XML instead. But it doesn’t matter which you use. It all ends up in an HTMLDocument, so it’s the same from there. Also, this example doesn’t log into a webpage. It uses Contextures’ Sample Data.
Dim xHttp As MSXML2.XMLHTTP
Dim hDoc As MSHTML.HTMLDocument
Dim hTable As MSHTML.HTMLTable
Dim hCell As MSHTML.HTMLTableCell
Dim doClip As MSForms.DataObject
‘Get the webpage
Set xHttp = New MSXML2.XMLHTTP
xHttp.Open "GET", "http://www.contextures.com/xlSampleData01.html"
xHttp.send
‘Wait for it to load
Do: DoEvents: Loop Until xHttp.readyState = 4
‘Put it in a document
Set hDoc = New MSHTML.HTMLDocument
hDoc.body.innerHTML = xHttp.responseText
‘Find the third table
Set hTable = hDoc.getElementsByTagName("table").Item(2)
‘Fix anything that looks like a date
For Each hCell In hTable.Cells
If IsDate(hCell.innerText) Then
hCell.innerText = "’" & hCell.innerText
End If
Next hCell
‘put it in the clipboard
Set doClip = New MSForms.DataObject
doClip.SetText "<html>" & hTable.outerHTML & "</html>"
doClip.PutInClipboard
‘paste it to the sheet
Sheet1.Select
Sheet1.Range("A1").Select
Sheet1.PasteSpecial "Unicode Text"
‘Make the leading apostrophes go away
Sheet1.Range("A1").CurrentRegion.Value = Sheet1.Range("A1").CurrentRegion.Value
End Sub
Once I get the table into an HTMLTable object, I loop through all the HTMLTableCells to see if any of them looks like a date. If so, I put an apostrophe in front of it. The apostrophe is the Excel way to say “No matter what I type next, assume it’s text”. Except when you’re pasting special as Unicode Text. In that case, it doesn’t hide the apostrophe like it should. So the last line of the code is the equivalent of pressing F2 and Enter all the say down the column and forcing Excel to reevaluate its decision to ignore that apostrophe.