Ben Biddington

Whatever it is, it's not about "coding"

Posts Tagged ‘programming’

SSH, cygwin and domain users


Yes you can log in to your local computer via ssh with a domain account.

If it seems you can’t (i.e., your password is rejected), then you most likely need to export your user accounts and groups so cygwin can see them.

Another clue that you need to export is if you get a message like:

Your group is currently "mkpasswd".  This indicates that
the /etc/passwd (and possibly /etc/group) files should be rebuilt.
See the man pages for mkpasswd and mkgroup then, for example, run
mkpasswd -l [-d] > /etc/passwd
mkgroup  -l [-d] > /etc/group
Note that the -d switch is necessary for domain users.

To export domain users:

$ mkpasswd -d >> /etc/passwd

To export groups:

$ mkgroup > /etc/group

Troubleshooting

Errors logging in as domain user

2 [main] -bash 31884 C:\cygwin\bin\bash.exe: *** fatal error - couldn't dynamically determine load address for 'WSAGetLastError' (handle 0xFFFFFFFF), Win32 error 126
Connection to localhost closed.

This is because the cygwin sshd service must also run as a domain account. I solved this by changing the service user to my domain account.


Written by benbiddington

4 August, 2010 at 13:37

Closure


Closures have been floating around lately, cropping up in Ruby as blocks and procs, as well as pure functional languages.

[A closure] is a first-class function with free variables. Such a function is said to be “closed over” its free variables. A closure is defined within the scope of its free variables, and the extent of those variables is at least as long as the lifetime of the closure itself. The explicit use of closures is associated with functional programming and with languages such as ML, Lisp and Perl. Closures are used to implement continuation passing style, and in this manner, hide state. Constructs such as objects and monads can thus be implemented with closures.

Free variables

A free variable specifies a place holder in an expression. Whether it is bound or free depends on where it is declared with respect to the expression.

It is approximately correct to say:

  • A variable is free if you can substitute a value for it and the resulting expression is meaningful.
  • A variable is bound if the expression is a statement about all the possible values of the variable all at once.  A bound variable is bound by an operator such as the integral sign, a quantifier, or a summation sign.

Or:

  • [A free variable is] An occurrence of a variable in a logic formula which is not inside the scope of a quantifier.
  • [A bound variable] In logic, [is] a variable that occurs within the scope of a quantifier, and cannot be replaced by a constant.

Example:

This expression is considered bound in x, and free in y. This expression holds for all values of x between the limits, but y can take only one value. The variable y stands for a fixed value, not specified inside the expression, while x is bound by the expression definition.

In mathematical examples, it is often mentioned that bound variables cannot be replaced with a constant, otherwise a meaningless expression would result — try replacing x with 1 in the integral above. Clearly d1 doesn’t make any sense.

Interestingly, as shown in the example, this can be applied to language:

Satomi found her book.

In this expression, the pronoun her is ambiguous. It may refer to Satomi, or any female declared outside the current context, or scope. The variable her is free.


Free variables in computer programming

In computer programming, a free variable is a variable referred to in a function that is not local (not declared within the scope of the function), or an argument of that function. An upvalue is a free variable that has been bound (closed over) with a closure.

So, in terms of a closure, a free variable is any variable in scope that is declared outside the closure itself, and is not supplied as an argument. By contrast, arguments and locals are always bound.
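To make this concrete, here is a minimal Ruby sketch. The names closure_demo, offset and add_offset are made up for illustration. Inside the lambda, n is bound because it is an argument, while offset is free because it is declared in the enclosing scope. The lambda closes over the variable itself rather than a snapshot of its value, which is why the second call sees the new value.

```ruby
def closure_demo
  offset = 10                             # declared outside the lambda
  add_offset = lambda { |n| n + offset }  # n is bound, offset is free

  before = add_offset.call(5)             # uses offset == 10
  offset = 20                             # change the captured variable
  after = add_offset.call(5)              # the lambda sees offset == 20

  [before, after]
end

p closure_demo   # prints [15, 25]
```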

Closed over free variables

A closure is said to be closed over its free variables. What does that mean? It means “completed by”: a closure expression is completed by specifying values for its free variables.

[TBD: hoisting, i.e. what a compiler emits for free variables.]

Closures in ruby

Though blocks are like closures in that they’re closed over their free variables, they’re not closures because they’re not really first class functions — a block cannot be passed around like an object.

A block can be converted to a proc, though. Capture a block as a proc using ampersand:

class Simple
    attr_reader :saved_block

    def initialize()
        yield self if block_given?
    end

    def save_block_for_later(&proc)
        @saved_block = proc
    end
end

And the proc can be assigned like:

var_x = 'x'
simple = Simple.new
simple.save_block_for_later { puts "The current value for var_x = '#{var_x}'."}

This closure can then be invoked at a later time — still bound to its free variables — using call:

simple.saved_block.call

Which prints the text:

The current value for var_x = 'x'.

Arguments are supplied in the usual way:

simple.save_block_for_later do |an_argument|
    puts "The current value for var_x = '#{var_x}', " +
        "and an_argument has been supplied as '#{an_argument}'."
end

simple.saved_block.call 'xxx'

Funargs

The funarg problem is the question of how to manage variable scoping when dealing with first-class functions.

Stack frames and locals

Traditionally, local variable scope is managed using stack frames.

The idea behind a stack frame is that each subroutine can act independently of its location on the stack, and each subroutine can act as if it is the top of the stack.

When a function is called, a new stack frame is created at the current esp location. A stack frame acts like a partition on the stack. All items from previous functions are higher up on the stack, and should not be modified. Each current function has access to the remainder of the stack, from the stack frame until the end of the stack page. The current function always has access to the “top” of the stack, and so functions do not need to take account of the memory usage of other functions or programs.

In short, functions are allocated temporary storage in a stack frame. This frame stores arguments and local variables. The frame is allocated before the function call, and cleaned up at function exit. The problem arises when a function returns another function.

Normally all local variables are removed with the stack frame, however if a function is returned that references locals, i.e.,  a closure, then these variables have to be kept alive.
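A small Ruby sketch of that situation follows (make_counter is a hypothetical name). The local count belongs to make_counter, yet it must outlive the call because the returned lambda still refers to it.

```ruby
# Returns a lambda that references count, a local of make_counter.
def make_counter
  count = 0
  lambda { count += 1 }
end

counter = make_counter   # make_counter has already returned here
p counter.call           # prints 1
p counter.call           # prints 2

other = make_counter     # a separate call gets its own count
p other.call             # prints 1
```

Each call to make_counter produces an independent count, so the variable cannot simply live in a stack frame that is discarded on return.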

CPU registers

[In computer architecture], a processor register is a small amount of storage available on the CPU whose contents can be accessed more quickly than storage available elsewhere.

  • ESP: stack pointer for top address of the stack
  • EBP: stack base pointer for holding the address of the current stack frame

In terms of functions, this article describes the roles of the EBP and ESP registers. The ESP register marks the top of the stack.


Written by benbiddington

22 June, 2009 at 21:24

Posted in development


IDisposable and unmanaged memory


My pair and I had to implement IDisposable the other day, and I had almost forgotten how and why it is done the way it is, so I thought I’d make some notes. An exceptionally clear summary can be found in section 9.3 of Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries, which I have used as the basis.

Objects that:

  1. Contain references to unmanaged resources, i.e., objects that don’t have finalizers. These types of objects should also define a finalizer.
  2. Or contain references to disposable objects.

should always implement IDisposable. Disposable objects offer clients a way to free resources deterministically, rather than whenever the CLR deems it necessary.

Here is a class that contains a simple implementation. It includes a finalizer because it contains a reference to an unmanaged object that doesn’t have its own.

public class UnmanagedResourceHolder : IDisposable {
    IntPtr buffer; // An unmanaged resource
    SafeHandle managedResource;

    public UnmanagedResourceHolder () {
        this.buffer = ... // init buffer
        this.managedResource = ...
    }

    public void Dispose() {
        Dispose(true);

        // Only suppress if Dispose(true) has completed successfully
        // to ensure finalizer gets a chance
        GC.SuppressFinalize(this);
    }

    // Finalizers cannot carry an access modifier in C#
    ~UnmanagedResourceHolder() {
        Dispose(false);
    }

    protected virtual void Dispose(Boolean disposing) {
        // Can't find reference for the following, assume it's self-explanatory...
        ReleaseBuffer(buffer);

        if (disposing) {
            // Run deterministic cleanup
            if (managedResource != null) {
                managedResource.Dispose();
            }
        }
    }
}

Points to note:

  • Unmanaged resources are released on both paths. This ensures deterministic cleanup is available as well as finalizer cleanup.
  • Managed resources are not released during finalization. This is because managedResource is managed: it will handle its own finalization. The next point is a further reason.
  • During finalization, (normally valid) assumptions about the internal state of an object are no longer reliable. Finalization occurs in an unpredictable order; for example, the managedResource field may have already been finalized.
  • Provided Dispose() is called, finalization is skipped (though there is still overhead, see below).
  • It is a good idea to provide a protected virtual Dispose to allow derived types to perform their own cleanup.
  • When overriding Dispose in a derived type, always invoke the base type’s Dispose (if there is one), so that the base type’s resources are released too.

A connection pool example

Why is it important to close database connections? Here’s what happens when a connection is not explicitly closed:

[Trace]
Audit Login		-- network protocol: TCP/IP
SQL:BatchStarting	SELECT count(1) from User
SQL:BatchCompleted	SELECT count(1) from User
Audit Logout

Here’s what happens when a connection is closed (or finalized):

[Trace]
Audit Login		-- network protocol: TCP/IP...
SQL:BatchStarting	SELECT count(1) from User
SQL:BatchCompleted	SELECT count(1) from User
Audit Logout
RPC:Completed		exec sp_reset_connection

Identical, except that sp_reset_connection is invoked at the end.

In both cases, the connection remains sleeping (process is waiting for a lock or user input):

login_time last_batch hostname cmd status
2009-06-15 09:17:29.590 BENB AWAITING COMMAND sleeping

This behaviour is part of ADO.NET connection pooling. Connections remain ready like this until they are considered surplus (and removed from the pool), or the application exits. You can prove this easily enough yourself: quit your test fixture and then requery your connection state.

From an ADO.NET pooling standpoint it is therefore important to close connections, so that the pooled connection becomes available for reuse.

If Open is invoked on a database connection, and there are no free connections available, an InvalidOperationException results with an error message like:

Timeout expired.  The timeout period elapsed prior to obtaining
a connection from the pool. This may have occurred because all pooled
connections were in use and max pool size was reached.
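The exhaustion scenario can be sketched with a toy pool in Ruby. This is an illustration only, not how ADO.NET is implemented: ToyPool and its methods are invented names, and where ADO.NET waits for the timeout period before failing, this sketch raises immediately.

```ruby
# Toy fixed-size pool; an illustration only, not the ADO.NET pool.
class ToyPool
  class Exhausted < StandardError; end

  def initialize(max_size)
    @free = Array.new(max_size) { |i| "connection-#{i}" }
  end

  # Hands out a free connection, or raises when none are left.
  def checkout
    raise Exhausted, 'max pool size was reached' if @free.empty?
    @free.pop
  end

  # Returns a connection to the pool, as closing a pooled connection does.
  def checkin(connection)
    @free.push(connection)
  end

  # Runs the block with a connection and always returns it afterwards.
  def with_connection
    connection = checkout
    yield connection
  ensure
    checkin(connection) if connection
  end
end

pool = ToyPool.new(1)
3.times { pool.with_connection { |c| c } }  # fine: each use returns the connection

pool.checkout                               # taken and never returned
begin
  pool.checkout
rescue ToyPool::Exhausted => e
  puts e.message                            # prints: max pool size was reached
end
```

Returning the connection in an ensure clause plays the same role as closing the connection in a finally block or using statement: the pool slot is released even when the work in between fails.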

Querying connection states

Examine connections in SQL Server using master.dbo.sysprocesses:

select login_time, last_batch, hostname, cmd, status
from master.dbo.sysprocesses with(nolock)
where dbid = DB_ID('PersonalWind')

Finalizers

Finalizers are only for unmanaged resources. A finalizer provides a mechanism for releasing unmanaged resources when clients omit explicit disposal. Finalization occurs before the garbage collector reclaims managed memory, and is the last chance for objects to release unmanaged resources.

[MSDN, Object Lifetime: How Objects Are Created and Destroyed] The garbage collector in the CLR does not (and cannot) dispose of unmanaged objects, objects that the operating system executes directly, outside the CLR environment. This is because different unmanaged objects must be disposed of in different ways. That information is not directly associated with the unmanaged object; it must be found in the documentation for the object. A class that uses unmanaged objects must dispose of them in its Finalize method.

Though useful in certain circumstances, finalizers are notoriously difficult to implement, and incur real overhead:

  • [MSDN] When allocated, finalizable objects are added to a finalization list. When these instances are no longer reachable and the GC runs, they’re moved to the “FReachable” queue, which is processed by the finalizer thread. Suppressing finalization with GC.SuppressFinalize sets a “do not run my finalizer” flag in the object’s header, such that the object will not get moved to the FReachable queue by the GC. As a result, while minimal, there is still overhead to giving an object a finalizer even if the finalizer does nothing or is suppressed.
  • When the CLR needs to call a finalizer, it postpones reclamation of managed memory until the next round. This means finalizable objects are longer-lived — they use memory for longer.

Non-determinism

There is no way to predict when a finalizer will be called, because the CLR decides dynamically, at runtime, when to reclaim memory. Garbage collection is an expensive exercise, and is minimized by design, so memory can persist long after the variables that reference it have dropped out of scope. This may be unacceptable for some systems. Database connection pooling is a prime example of this. Failure to release connections by closing them when they’re no longer required quickly cripples a system.


Written by benbiddington

15 June, 2009 at 21:01

TeamCity — rake runner aborts early


How to run TeamCity rakerunner locally

It is important to get the working directory right, otherwise you’ll get a bunch of require errors, for example:

C:\ruby\bin/ruby.exe C:\BuildAgent\plugins\rake-runner\lib\rb\runner\rakerunner.rb

Fails with error:

C:/ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- teamcity/rakerunner_consts (LoadError)
from C:/ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
from C:/BuildAgent/plugins/rake-runner/lib/rb/runner/rake_ext.rb:20

rake_ext is including a file with path:

require 'teamcity/rakerunner_consts'

In Ruby 1.8, require searches the directories in $:, which include the current working directory, so our working directory must be set to:

C:\BuildAgent\plugins\rake-runner\lib\rb\patch

Rake abort

TeamCity rakerunner exits with message “Rake aborted!” as soon as it encounters a non-zero exit code from any task. Any subsequent tasks are therefore skipped.

This is not desirable behaviour for us because we have reporting tasks that must always run.

Stack trace showing the overridden members:

C:/BuildAgent/plugins/rake-runner/lib/rb/runner/rake_ext.rb:158:in 'process_exception'
C:/BuildAgent/plugins/rake-runner/lib/rb/runner/rake_ext.rb:95:in 'target_exception_handling'
C:/BuildAgent/plugins/rake-runner/lib/rb/runner/rake_ext.rb:266:in 'execute'
C:/ruby/lib/ruby/gems/1.8/gems/rake-0.8.4/lib/rake.rb:578:in 'invoke_with_call_chain'
C:/ruby/lib/ruby/1.8/monitor.rb:242:in 'synchronize'
C:/ruby/lib/ruby/gems/1.8/gems/rake-0.8.4/lib/rake.rb:571:in 'invoke_with_call_chain'
C:/ruby/lib/ruby/gems/1.8/gems/rake-0.8.4/lib/rake.rb:564:in 'standard_invoke_with_call_chain'
C:/BuildAgent/plugins/rake-runner/lib/rb/runner/rake_ext.rb:235:in 'invoke'
C:/BuildAgent/plugins/rake-runner/lib/rb/runner/rake_ext.rb:90:in 'target_exception_handling'
C:/BuildAgent/plugins/rake-runner/lib/rb/runner/rake_ext.rb:234:in 'invoke'
C:/ruby/lib/ruby/gems/1.8/gems/rake-0.8.4/lib/rake.rb:2027:in 'invoke_task'
C:/ruby/lib/ruby/gems/1.8/gems/rake-0.8.4/lib/rake.rb:2005:in 'top_level'
C:/ruby/lib/ruby/gems/1.8/gems/rake-0.8.4/lib/rake.rb:2005:in 'each'
C:/ruby/lib/ruby/gems/1.8/gems/rake-0.8.4/lib/rake.rb:2005:in 'top_level'
C:/BuildAgent/plugins/rake-runner/lib/rb/runner/rake_ext.rb:311:in 'standard_exception_handling'
C:/ruby/lib/ruby/gems/1.8/gems/rake-0.8.4/lib/rake.rb:1999:in 'top_level'
C:/ruby/lib/ruby/gems/1.8/gems/rake-0.8.4/lib/rake.rb:1977:in 'run'
C:/BuildAgent/plugins/rake-runner/lib/rb/runner/rake_ext.rb:311:in 'standard_exception_handling'
C:/ruby/lib/ruby/gems/1.8/gems/rake-0.8.4/lib/rake.rb:1974:in 'run'
C:/BuildAgent/plugins/rake-runner/lib/rb/runner/rake_ext.rb:179:in 'run'
C:/BuildAgent/plugins/rake-runner/lib/rb/runner/rakerunner.rb:40

Here’s what rake Application.top_level looks like:

# Rake lib
def top_level
    standard_exception_handling do
        if options.show_tasks
            display_tasks_and_comments
        elsif options.show_prereqs
            display_prerequisites
        else
            top_level_tasks.each { |task_name| invoke_task(task_name) }
        end
    end
end

The TeamCity version has overridden standard_exception_handling:

# TeamCity Rakerunner
def standard_exception_handling
    begin
        yield
    rescue Rake::ApplicationAbortedException => app_e
        raise
    rescue Exception => exc
        Rake::TeamCityApplication.process_exception(exc)
    end
end

So the yield there runs the block that invokes each top-level task. The rake task implementation has an execute method like:

# TeamCity Rakerunner
def execute(*args, &block)
    standard_execute_block = Proc.new do
        standard_execute(*args, &block)
    end

    if application.options.dryrun
        Rake::TeamCityApplication.target_exception_handling(
            name, true, "(dry run)", &standard_execute_block)
    else
        Rake::TeamCityApplication.target_exception_handling(
            name, true, &standard_execute_block)
    end
end

Here Rake::TeamCityApplication.target_exception_handling ends up in process_exception, which raises Rake::ApplicationAbortedException. We can disable this behaviour by commenting out the raise.

Solution

The quick solution is to comment out the line that raises the exception:

# rake_ext line 158, method TeamCityApplication.process_exception
raise Rake::ApplicationAbortedException, exc

The problem now, though, is that a failing task will not be reported automatically.

In this project that’s fine because we’re doing this ourselves in a subsequent step, but for other projects using the rakerunner this may prove confusing.
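The idea of letting every task run and reporting failures afterwards can be sketched in plain Ruby. This is a simplified illustration under assumed names (run_all and the three stand-in tasks are hypothetical), not code from rake or the TeamCity rakerunner.

```ruby
# Runs every task even if an earlier one fails, and records the failures
# so they can be reported afterwards.
def run_all(tasks)
  succeeded = []
  failures = []
  tasks.each do |name, body|
    begin
      body.call
      succeeded << name
    rescue StandardError => e
      failures << "#{name}: #{e.message}"
    end
  end
  [succeeded, failures]
end

# Hypothetical stand-ins for real build tasks.
tasks = [
  ['compile',  lambda { :ok }],
  ['cucumber', lambda { raise 'scenarios failed' }],
  ['report',   lambda { :ok }]
]

succeeded, failures = run_all(tasks)
puts "Succeeded: #{succeeded.join(', ')}"
puts "Failed: #{failures.join('; ')}" unless failures.empty?
```

Collecting the failures rather than swallowing them keeps the information that the commented-out raise would otherwise have surfaced.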

Written by benbiddington

18 May, 2009 at 12:19

Rake — shell task failure


For my current project, we’re running cucumber as one of our tasks. Our report generation runs afterwards — which we need to happen regardless of cucumber’s exit code (cucumber exits with code 1 when any tests fail).

Handling shell task failure

We noticed that if cucumber fails, rake exits immediately — without running our subsequent report tasks.

We’re running cucumber with rake’s ruby function — which runs the ruby interpreter in its own process by invoking a shell command.

Pseudo-stacktrace (irrelevant source lines omitted):

Shell::CommandProcessor.system(command, *opts)
Rake::Win32.rake_system(*cmd)
FileUtils.sh(*args, &block)
FileUtils.ruby(*args, &block)

The key to this is the behaviour of rake’s FileUtils.sh. Examining the source shows:

If it is not supplied a block and the shell command fails, rake exits with an error status.

Aside: For some reason, Kernel.system echoes errors.

Example

This task will cause rake to exit immediately with status 1:

task 'this-one-fails' do
    ruby("xxx")
end

while the following task does not cause rake to terminate, so any subsequent tasks will still run:

path = File.expand_path(File.dirname(__FILE__) + '/output/cucumber-manual.txt')
cuke_path = 'C:/ruby/lib/ruby/gems/1.8/gems/cucumber-0.3.3/bin/cucumber'
ruby("#{cuke_path} -h > '#{path}'") do |success, exit_code|
    puts "Exited with code: #{exit_code}" if !success
end

task 'this-one-succeeds' do
    ruby("xxx") do |success, exit_code|
        puts "Exited with code: #{exit_code.exitstatus}" unless success
    end
end

Getting the last exit status

The $? variable contains the status of the last child process to terminate. This may be useful if steps subsequent to the failed one require this information.
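Outside rake, the same information is available from Kernel#system and $?. The sketch below spawns a child Ruby process that exits with a chosen code; run_child is a made-up helper name.

```ruby
require 'rbconfig'

# Runs a child Ruby process that exits with the given code and returns
# [success flag from system, exit status read from $?].
def run_child(exit_code)
  succeeded = system(RbConfig.ruby, '-e', "exit #{exit_code}")
  [succeeded, $?.exitstatus]
end

p run_child(0)   # prints [true, 0]
p run_child(3)   # prints [false, 3]
```

$? holds a Process::Status object, which is why the exit code is read with exitstatus.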

Written by benbiddington

16 May, 2009 at 16:45

Cucumber — ANSI formatting


When cucumber output is piped to a file on disk, it may contain non-printable characters and markup, for example:

[32m6 passed steps

Actually looks like this when output to console:

6 passed steps

This is due to cucumber emitting ANSI formatting.

[ANSI escape sequences] are used to control text formatting and other output options on text terminals.

Most of these escape sequences start with the characters ESC (ASCII decimal 27/hex 0x1B/octal 033) and [ (left bracket). This sequence is called CSI for Control Sequence Introducer (or Control Sequence Initiator). There is a single-character CSI (155/0x9B/0233) as well.

The ESC+[ two-character sequence is more often used than the single-character alternative, for details see C0 and C1 control codes. Devices supporting only ASCII (7-bits), or which implement 8-bit code pages which use the 0x80–0x9F control character range for other purposes will recognize only the two-character sequence. Though some encodings use multiple bytes per character, in this topic all characters are single-byte…
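If the output has already been written to a file, the colour codes can also be removed afterwards. The following Ruby sketch strips only the colour and style sequences (ESC, [, numeric parameters, then m); the pattern and the strip_ansi name are my own, and other kinds of escape sequence are left untouched.

```ruby
# Matches ESC, '[', optional numeric parameters separated by ';', then 'm'.
SGR_SEQUENCE = /\e\[[0-9;]*m/

# Returns the text with colour and style sequences removed.
def strip_ansi(text)
  text.gsub(SGR_SEQUENCE, '')
end

coloured = "\e[32m6 passed steps\e[0m"
puts strip_ansi(coloured)   # prints: 6 passed steps
```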

This formatting can be suppressed with the --[no-]color switch, passed as:

--no-color

Written by benbiddington

12 May, 2009 at 13:45

Ruby require


While diagnosing problems with a ruby application, we discovered we had to move certain folders around so the interpreter could find them. Clearly we were missing something fundamental.

Require: definition

Inspecting the source (v1.9) for require shows:

Ruby tries to load the library named _string_, returning
 *  +true+ if successful. If the filename does not resolve to
 *  an absolute path, it will be searched for in the directories listed
 *  in <code>$:</code>. If the file has the extension ``.rb'', it is
 *  loaded as a source file; if the extension is ``.so'', ``.o'', or
 *  ``.dll'', or whatever the default shared library extension is on
 *  the current platform, Ruby loads the shared library as a Ruby
 *  extension. Otherwise, Ruby tries adding ``.rb'', ``.so'', and so on
 *  to the name. The name of the loaded feature is added to the array in
 *  <code>$"</code>. A feature will not be loaded if it's name already
 *  appears in <code>$"</code>. However, the file name is not converted
 *  to an absolute path, so that ``<code>require 'a';require
 *  './a'</code>'' will load <code>a.rb</code> twice.

Require does not canonicalize paths

Here’s the relevant part of the comment:

However, the file name is not converted
to an absolute path, so that
require 'a';
require './a'
will load a.rb twice.

So, as described, ruby treats require statements literally: a different path is a different file, even if it’s a different path to the same file. And this may result in a file being loaded twice.

This can have confusing side effects. We encountered this in one of our cucumber projects — after scenario handlers were firing twice and we didn’t know why.

Require does not expand paths

Another implication of this behaviour is that an application may not run as expected unless the correct working directory is used.

Adding to the $" list does not work

I thought another option may be adding items directly to the $" list. I guess this shows that require is doing more than just inserting an item in a list.

Adding to $" is in fact not enough, demonstrated by the fact that this works:

require file_to_include
$".unshift(file_to_include)

While the reverse order does not:

$".unshift(file_to_include)
require file_to_include

This proves that require has exited early because the file already exists in $", and hence the include fails.

Solution

Fortunately there is a way around both of these problems.

Rather than require relative paths:

require '../lib/application'

absolute paths may be specified:

require File.expand_path(File.dirname(__FILE__) + '/../lib/application')

This means that:

  1. Working directory is no longer relevant. The second example ensures that the required path is relative to the file that requires it.
  2. Duplicate requires are not possible, because every require of the file resolves to the same absolute path.

EDIT, 2009-05-12: Also consider require_relative, added in v1.9
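A runnable sketch of the expand_path approach follows. It writes a throwaway file to a temporary directory and requires it through two different spellings of the same location; the file name greeting.rb, the $greeting_loads counter and the require_twice_demo helper are all invented for the demonstration.

```ruby
require 'tmpdir'

# Writes a throwaway library, then requires it through two different
# spellings of the same location. Returns both require results and the
# number of times the file body actually ran.
def require_twice_demo
  $greeting_loads = 0
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, 'greeting.rb'), "$greeting_loads += 1\n")

    first  = require(File.expand_path('greeting', dir))
    second = require(File.expand_path('sub/../greeting', dir))
    [first, second, $greeting_loads]
  end
end

p require_twice_demo   # prints [true, false, 1]
```

Both spellings expand to the same absolute path, so the second require finds the file already listed in $" and returns false without loading it again.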

Written by benbiddington

11 May, 2009 at 13:00

Posted in development
