Sunday 14 October 2012

Trouble-shooting: Seg-fault (erroneous memory access) on glDrawElements()

Yesterday, when I was writing basic rendering code for the new ViRe, I kept getting erroneous memory access errors (in the form of an Ada exception).

With the small code size I currently have, finding the offending line of source code was a matter of minutes. However, even knowing that the exception was raised during the call to glDrawElements(), I still did not know its cause. My first suspect was the Ada OpenGL binding that I have been writing alongside the renderer.

Today, I wrote a small programme (~100 lines of Ada source text), set up everything that is needed and called glDrawElements(). To my surprise, it worked perfectly. The binding is working fine; it is the code that loads and prepares data that is at fault.

After some time, I found out that if you call glDrawElements() on a VAO that is not ready to be rendered, be it a non-existent array object (including the id 0) or an uninitialised one, a segmentation fault will happen. By uninitialised, I mean that the vertex buffer or index buffer has not been generated or bound, or you forgot to load data into the index buffer, or you failed to enable the corresponding vertex attribute array, or any combination of these.

As long as index data is present, its values do not matter. Bad values may cause an empty screen, but not a memory access violation. The absence of vertex data does not trigger the fault either.
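The conditions above can be written down as a pre-draw checklist. What follows is only a bookkeeping sketch in Python, not OpenGL code: VaoState and missing_steps are hypothetical names invented for this illustration, and the flags would have to be maintained by the application itself as it makes the corresponding GL calls. It simply encodes what was observed on this one driver.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VaoState:
    """Hypothetical record of what the application has done to one VAO."""
    vao_id: int = 0                     # 0 means no vertex array object
    vertex_buffer_bound: bool = False   # vertex buffer generated and bound
    index_buffer_bound: bool = False    # index buffer generated and bound
    index_data_loaded: bool = False     # data uploaded to the index buffer
    attrib_array_enabled: bool = False  # vertex attribute array enabled


def missing_steps(state: VaoState) -> List[str]:
    """Return the setup steps that are still missing before an indexed draw."""
    problems = []
    if state.vao_id == 0:
        problems.append("no vertex array object (id 0)")
    if not state.vertex_buffer_bound:
        problems.append("vertex buffer not generated or bound")
    if not state.index_buffer_bound:
        problems.append("index buffer not generated or bound")
    if not state.index_data_loaded:
        problems.append("no data loaded into the index buffer")
    if not state.attrib_array_enabled:
        problems.append("vertex attribute array not enabled")
    return problems
```

Calling missing_steps() just before the draw call and refusing to draw when the list is not empty would turn the segmentation fault into a readable message. Vertex data is deliberately not on the list, since its absence did not cause the fault.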

Note that glDrawArrays() did not generate this error under any of the circumstances I tried. That makes sense, since it does not rely on an index buffer object or index data.

Development environment:

  • OS: Linux Mint 13 with kernel 3.2.0-30-generic
  • Display Card: NVIDIA GT640M LE
  • Display Card driver: Bumblebee with NVIDIA binary driver 304.51

Sunday 26 August 2012

Setting Up Linux Mint on Fujitsu LH772

Recently, I bought a new notebook from Fujitsu, the LH772. Before the purchase, I tested that Ubuntu 12.04 supports the computer's function keys (mostly brightness and volume control), sound card and speakers, so I decided to install Linux on it. There were a few problems, mostly regarding the display card and wireless network, but I was somehow convinced that I could fix them or, at the least, work around them. This is the result of my research and journey.

The on-board hard disk drive is preloaded with an OEM Windows 7 installer. I did not want Windows 7, nor did I want to mess up the boot loader or get caught installing a non-verified OS and void the warranty. So I removed the drive and bought an SSD. The drive cover is secured by only two screws, the drive itself by another two. Changing the drive is very easy on the LH772.

I heard you will install Windows on first boot. I won't let you.
SSD speed test
555 MiB/s, about 10 times what ordinary hard disk drives can achieve. I am impressed. SSD performance varies greatly with the data and usage conditions. Reviews and benchmarks have shown that the SP900's read performance is between... 50 MiB/s and 530 MiB/s, depending on data type and reading pattern.

The wireless network card does not work out of the box. If that is your only way to the internet, feel free to jump to the section about the wireless network.

Display Card (NVIDIA GT640M LE)

The notebook features an NVIDIA GT640M LE display card, which "supports" Optimus Technology [Wikipedia], a hybrid display and power-saving solution developed by NVIDIA. Essentially, Optimus turns off the discrete display card when you don't need the processing power. The key benefits of using Optimus, according to NVIDIA, are:

  • completely automatic allowing you to experience longer battery life and amazing visuals without having to manually change settings;
  • behind the scenes and with no interference to what you're doing, Optimus seamlessly figures out how to best optimize your notebook computing experience.
The caveat is that the "automatic" and "no interference" parts come only if you are using Windows 7. If you look at the Wikipedia article, you will notice there is no Optimus support under Linux. That is, your discrete NVIDIA display card will run at full speed, draining the battery and overheating your computer. Ironically, at this stage, you cannot use the discrete card, even if you have the driver installed.

NVIDIA. Damn. You.

So, if even the official driver cannot help you, who can? The legendary Linux community. Enter the Bumblebee Project, "a project aiming to support NVIDIA Optimus technology under Linux," as stated by the website.

Ubuntu has a PPA for Bumblebee. This simplifies installation a lot. However, I don't like the Unity interface and where Ubuntu is going; that's why I turned to Linux Mint, deemed "Ubuntu done right" by many. I was a Linux Mint Debian Edition (LMDE) user, but I failed to install Bumblebee on an older computer running it. I might try that again later.

To get Bumblebee to work on Linux Mint 13 Maya, we need to add its PPA to the software sources. Open a terminal and type
sudo add-apt-repository ppa:bumblebee/stable
The GT640M LE works only with the official driver version 304 (some say 302) or above, which is not provided by the default repositories, so we need to add another PPA:
sudo add-apt-repository ppa:ubuntu-x-swat/x-updates
If you did not add the x-swat PPA and got the wrong driver, first uninstall everything related to NVIDIA or Bumblebee, i.e. nvidia-settings, nvidia, nvidia-current, bumblebee and bumblebee-nvidia. bbswitch-dkms can stay, as it doesn't depend on the driver.

Then refresh the package lists (sudo apt-get update) and install the package bumblebee. apt-get will get all the dependencies for you.

If the NVIDIA driver changes your X server settings and messes up your resolution, just delete /etc/X11/xorg.conf (or, to be safe, just rename it). Modern display drivers and X servers don't need the file.

Now if you try to turn on the discrete card with, say,
optirun glxspheres
you might get an error message like this:
[ 34.776889] [ERROR]Cannot access secondary GPU - error: Could not load GPU driver
[ 34.776919] [ERROR]Aborting because fallback start is disabled.

That's because a setting in Bumblebee is wrong. Edit /etc/bumblebee/bumblebee.conf:
sudo nano /etc/bumblebee/bumblebee.conf
Change the line
Driver=
to
Driver=nvidia
and in the section [driver-nvidia] change the line
KernelDriver=nvidia-current
to
KernelDriver=nvidia
After a reboot, optirun should work.
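The two edits to bumblebee.conf can also be scripted. Here is a minimal sketch in Python, assuming the file consists of plain key=value lines under [section] headers. set_key is a hypothetical helper, and the sample text in the usage note is illustrative rather than a copy of the real file (the [bumblebeed] and [driver-other] section names in it are assumptions).

```python
def set_key(text, key, value, section=None):
    """Return text with every `key=...` line set to `key=value`.

    If section is given, only lines inside that [section] are changed,
    so a key of the same name in another section is left alone.
    """
    current = None
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            current = stripped[1:-1]  # entering a new section
        elif "=" in stripped and stripped.split("=", 1)[0].strip() == key:
            if section is None or current == section:
                line = f"{key}={value}"
        out.append(line)
    return "\n".join(out) + "\n"


# Illustrative sample only; the real file has more sections and keys.
sample = (
    "[bumblebeed]\n"
    "Driver=\n"
    "\n"
    "[driver-nvidia]\n"
    "KernelDriver=nvidia-current\n"
    "\n"
    "[driver-other]\n"
    "KernelDriver=other\n"
)

fixed = set_key(sample, "Driver", "nvidia")
fixed = set_key(fixed, "KernelDriver", "nvidia", section="driver-nvidia")
```

Restricting the second call to the driver-nvidia section matters because other driver sections may carry their own KernelDriver line. The sketch only transforms text; writing the result back to /etc/bumblebee/bumblebee.conf still needs root, and keeping a backup of the original file first is prudent.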

Wireless LAN Card (Ralink RT3290 PCIe)

Compared to the display card, getting Wi-Fi to work is much simpler. No, it's not.

The Linux kernel does not come with a driver for this wireless LAN card before version 3.9, so we have to *gasp* build it from source.

First, download the driver from Ralink MediaTek. If you don't want to tell them your e-mail address, just type a@a.a like I did.

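Decompress the archive file to, say, ~/wireless_driver (if the archive unpacks into its own top-level directory, move its contents up so that the paths below match):
mkdir -p ~/wireless_driver
tar -xf //2012_0508_RT3290_Linux_STA_v2.6.0.0.bz2 -C ~/wireless_driver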
Before we can build the driver, we have to make a few changes to the source files.

Open ~/wireless_driver/os/linux/config.mk
Change line 27 from
HAS_NATIVE_WPA_SUPPLICANT_SUPPORT=n
to
HAS_NATIVE_WPA_SUPPLICANT_SUPPORT=y
Some say this will not change the compilation, but I did not test what happens if it is left untouched as n.

In the same file, delete or comment out -DDBG on line 206. This turns off debug mode. If debug mode is on, the driver will constantly flood dmesg with garbage messages, hiding the really important information. I have no idea why anyone would ship a driver with debug mode on by default.

Apparently, the developers have not tested how normal users use a driver. With the -DDBG flag removed, the source will not compile. What we need to do is open ~/wireless_driver/sta/sta_cfg.c and delete line 4674
#endif /* DBG */
and line 4080
#ifdef DBG
to get the function definitions out of the ifdef block.
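Line numbers such as 4674 and 4080 only hold for this exact driver release, and deleting the lower line first would shift the higher one up by one. The small Python sketch below checks each line's content before deleting it and works from the bottom up. delete_lines is a hypothetical helper, and the sample source in the usage note is made up for illustration.

```python
def delete_lines(lines, expected):
    """Delete the given 1-based line numbers from a list of source lines.

    expected maps each line number to the text that line must start with;
    a mismatch raises ValueError instead of deleting the wrong line.
    Deletion runs from the highest number down, so earlier deletions
    never shift the lines that are still to be removed.
    """
    result = list(lines)
    for number in sorted(expected, reverse=True):
        text = result[number - 1].strip()
        if not text.startswith(expected[number]):
            raise ValueError(
                f"line {number} is {text!r}, expected {expected[number]!r}")
        del result[number - 1]
    return result


# Made-up sample standing in for sta_cfg.c.
source = [
    "int a;",
    "#ifdef DBG",
    "void show_info(void) {}",
    "#endif /* DBG */",
    "int b;",
]
patched = delete_lines(source, {2: "#ifdef DBG", 4: "#endif /* DBG */"})
```

For the real file, the mapping would be {4080: "#ifdef DBG", 4674: "#endif /* DBG */"}, applied to the lines read from sta/sta_cfg.c.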

Now the source is ready. Change to the directory of the source in case you are not in it already:
cd ~/wireless_driver
If you have installed this driver previously, clean the compiled and installed files before re-installation:
sudo make uninstall
sudo make clean

Compile and install:
sudo make
sudo make install

The driver should start automatically when the computer restarts.

IBus

As someone who uses Chinese daily, I need a Chinese input method. I have been using IBus since Ubuntu introduced it in version... (I have to check) 9.10. IBus works flawlessly once installed, except with Skype. The reason is that Skype uses the Qt 4 library, but a normal installation of IBus on Mint doesn't include the bindings for it (package ibus-qt4). Install that package, then start IBus (with its XIM server) by
ibus-daemon -x
Everything should work. If not, change the line
XIM=
to
XIM=ibus
in the file /etc/X11/xinit/xinput/default
Someone said that other files and options need to be set, but this is the one that did the magic for me.

Unresolved Problems

My Logitech M570 wireless trackball uses Logitech's Unifying receiver. If the computer boots with the receiver plugged in, the mouse will not work. Re-plugging sometimes helps; removing and then re-adding the hid_logitech_dj module using modprobe sometimes helps. Nothing is clear to me yet.

Function keys stop working after suspend. But if you have an SSD like I do, rebooting the computer takes only half a minute. So luckily it's not that big an issue.

Bluetooth is nowhere to be found except in lshw and lspci.

Sound level / mute settings and brightness settings usually won't survive a reboot.

The slowest mouse speed is still too fast for my 3500 dpi Razer DeathAdder.

Cinnamon is unstable with the hot corner enabled, particularly when scrolling fast in Firefox.

Update (29-8-2012)

I'm glad I have written this article. My SSD died just one day after I set everything up.

Update (20-7-2013)

Linux kernel 3.9 has out-of-the-box support for the Ralink RT3290.

Ralink has merged with MediaTek, and the old site is no longer accessible. I have updated the link accordingly.

Monday 30 July 2012

Android 4.1.1 Calendar Reminder Bug (or not)

When I upgraded to Android 4.1.1 Jelly Bean via OTA earlier this month, the bundled Calendar application was upgraded to version 4.1.1-398337. After this, a bug appeared.

The bug only happens when the default reminder time is set to None.

After adding a new event, everything seems fine. Go into the event screen, and everything is normal. But when you exit to month view, or exit the calendar, and then go back to see the event, two reminders have been set up automatically for you: one for (Android) notification, one for e-mail, both 10 minutes before the event commences. (See the second-to-last paragraph.)

If you set a default reminder time other than "None", the reminder will be normal: just one notification reminder at the designated time.

Today, I searched for the problem and found someone mentioning something about synchronisation. So, I logged in to Google Calendar on a PC for the first time. Midway through the setup wizard is the default reminder setting. I immediately recognised something familiar: a ten-minute notification to phone and e-mail. I cleared these two, saved the settings, and created new events on my phone. No reminders, good.

Then I created a new event for my other Google account, with which I had never logged in to Calendar on a PC before. Not surprisingly, the unexpected reminders were there again.

I changed the default reminder setting in the Google account to 10 hours, then created a new event again on the phone. After syncing with the server, a 10-hour reminder appeared on the phone. Apparently, the initial 10-minute reminders came from the server, too. It looked as if changing to month view or exiting the application triggered the reminder generation, but it turns out that was just the synchronisation delay.

There seem to be two independent reminder systems in the 4.1.1 Calendar application: one on the phone, one on the server. There is no way to change the server's settings from the phone, and vice versa, hence the annoyance I faced.

Friday 29 June 2012

Anything that can go wrong...

Anything that can go wrong will go wrong, as stated by Murphy's Law.

My father's company has been developing a piece of application software. I do not know how much I can disclose, so I will just say that it is supposed to increase productivity, especially in the insurance industry. Anyway, this is just the background and it should not really matter.

Having learnt that a contractor would charge a few hundred thousand dollars (Hong Kong Dollars) for the product, the company decided to develop it in-house. Not previously having a development department, the company built one from the ground up, with just a handful of programmers. The project manager is an acquaintance of my father. He had operated a software firm, which went defunct due to mismanagement.

The company based the department in Shenzhen, Mainland China, to cut costs. Consequently, most of the programmers are Mainlanders.

In a few short meetings, my father explained to the project manager the functions and requirements of the application. Every week, the whole team would gather, reflect on what they had done in the past week, update the schedule, and plan what to do in the next. However, most of the meetings were useless: nothing was discussed because nothing had been achieved in the previous week. Even when something was discussed, it was mostly complaints from management that the application did not conform to the requirements, and complaints from the development team that the company constantly changed the requirements.

Two years later, at an expo where the company was trying to promote the unfinished product and attract investors, they learnt that the application had to be on-line: a set of data might even be accessed simultaneously by different users in different places. While continuing to develop the current, single-user version, the company started to develop a web-based version.

Soon after that, the single-user version was completed and shipped to several insurance managers. Not until then did the company find out that the software was incompatible with Windows in Traditional Chinese (and every other language except English and Simplified Chinese). Apart from that, the programme was very buggy: marketed features did not function properly, and crashes were frequent. As a result, the company cancelled it and focused on the on-line version, which was still in its infancy.

Year after year, the development team worked on a moving target, and they saw no prospects for either the project or themselves. As the new version slowly grew, developers started to leave. By the time the software was almost finished, the company had lost all of its developers.

When the company wanted to hire new developers to continue the project, they faced a serious problem: they could hire no one. The web-based version of the software used ASP as its back-end, and the company simply could not find programmers who knew ASP. Deeming ASP outdated anyway, the company decided to hire new programmers to re-implement everything in a newer language / platform: C# / .NET. Another two years passed without anything new coming out.

To date, a few hundred thousand dollars have been spent; the application is still far from the alpha stage.

Still holding out hope for the product, the company wants to complete it in collaboration with a software firm. Luckily, the company still holds the old source code and the rights to it. Hoping to smooth the transition, I went to Shenzhen earlier this month to see what was salvageable. The answer was, unsurprisingly, not much. Maybe most of the code base can be reused, since only the framework is done in the .NET version; however, without any documentation, it is very hard to understand what the code does. We do not even know which parts work and which do not: a month or two might be needed just to figure that out.

What went wrong?

Almost everything went wrong with this project, right from the beginning.

The first mistake was totally underestimating the cost of the project. If you ask a professional software firm about the cost of developing a piece of software, you should genuinely believe their answer. If you do not trust them, you can always consult another firm.

Once you have convinced yourself that the price offered is reasonable, however, do not think you can lower it, by whatever means. If it costs a professional firm that much to build the system, it will certainly cost you more if you do it yourself. Remember, you seek out those firms for their professionalism; otherwise you would not have approached them in the first place. Sadly, many people do not think of software development as a serious business, and hence as a professional field.

Of course, what a software firm charges is not its cost, but a price that lets it, as a company, make a profit. However, even considering this, you will still spend more money if you do it in-house, because you lack experience, knowledge, personnel, and professionalism.

Compared to paying a software firm, the cost of developing in-house will likely be at least double. Again, do not ever think you can lower the cost by any means, including but not limited to hiring developers at lower salaries, cutting some “unnecessary” job positions, and neglecting “superfluous” software engineering practices. Actually, all these practices are what make costs surge and are what you need to avoid (although not all that you need to avoid). Incidentally, the company did all three.

You do not want to pay low salaries to developers because they come in different grades: there are good programmers and bad programmers. Technicians and programmers in Mainland China are cheap for a reason. If you pay a high salary, there is no guarantee that you will hire a good programmer. However, if you pay a low salary, you are guaranteed to get a bad one. A bad programmer does not only lack the productivity of a good one; a bad programmer drains the productivity of others in the same team. Bad programmers might do things you would consider unthinkable, such as refusing to fix bugs in their code, or introducing (not finding, but introducing) bugs in others' code. If anyone depends on this code, progress will be hindered.

Now imagine a team composed entirely of bad programmers. This team will not only drag its own productivity down to the lowest point, but also diminish the efforts of future teams that work on this project or other related projects.

By neglecting “unnecessary” personnel and procedures, the company removed everything connected to quality assurance, and consequently everything related to quality. See how the company had its programmers create a programme that was buggy and ignorant of foreign languages and character encodings, yet no one noticed the problem until the product was shipped. When there is no full-time, dedicated quality assurance team, problems will slip through the development process and find their way to the customers.

You cannot rely on programmers to find bugs in their own code. Maybe they will find some, but the search will not be comprehensive. When programmers write code, they think they have already considered every possible situation, which has been proved wrong so many times. Furthermore, they are already preoccupied with many other things, such as the requirements of the client and the constraints of the hardware, software, and programming language. If you assign them another job that requires attention and dedication, they will fail at both.

Things do not end here. The development team never used a version control system or a bug tracking system. Without these, there is no way to figure out when and where the team introduced a feature or a bug into the code base. This makes development and maintenance very slow, if not impossible, on any programme beyond a few thousand lines of code.

One time, I watched in horror as the programme did not compile and a programmer uncommented a large chunk of code. Compilation was still not successful. The programmer made a dozen changes and the programme still failed to compile. By that time, the source code was unrecognisable compared with what it had been ten minutes earlier. It was not until half an hour later that the programmer finally satisfied the compiler. The code compiled, but no one knew if it functioned properly.

No one knew if the programme functioned properly, because no one agreed on how the programme should behave. The team lacked documentation: specification, schedule, code comments, everything. Without a spec, the team will always work against a moving target and disagree on functionality. Without a detailed schedule, work is guaranteed to crash through deadlines. Without code comments, it is impossible to maintain the code, both for those who wrote it and for those who did not.

It is a miracle that the application reached its current size.

Monday 12 March 2012

Competition

When BlenderGuru announced a new competition last month, I decided to give it a shot and participate. One month has passed, and my entry is still far from finished. I still have half of the texturing, the fluid simulation, dynamic painting, lighting and additional modelling to do. There is only one week until the deadline. One day is needed solely for rendering and tweaking the nuances of lighting and camera angle, so I only have six days to do real work. For the next two weeks, I have two mid-term exams and three assignments due. I doubt that I can make it. Nevertheless, even if I cannot submit the entry on time, I will still post it here. Stay tuned.

Thursday 19 January 2012

Marble Hong Kong

I am not satisfied with the effect, but I will not work on this any more. My computer is choking on the scene so much that I can barely work on it.


I was trying to use the scratch marks as bump maps, in the hope of making the scratches look more like scratches. Sadly, I don't know how to do it when using texture nodes. After a few hours of messing around, I eventually gave up.

I don't think many will get what the picture is portraying, because of my limited skill, so I will explain it a bit. The marble pieces are Hong Kong Island and Lamma Island. I used them to represent Hong Kong, since I think that they are quite recognisable (though they might not actually be). The red smoke on the left is a sandstorm. The marble piece is being scratched and damaged by the sandstorm.